Document Management API
This document describes the document management API endpoints, which let users create, read, update, and delete documents stored in S3 buckets, subject to their access roles.
Authentication
All endpoints require authentication via Bearer token. The authenticated user's access roles determine which documents they can access and which operations they can perform.
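As a minimal sketch of attaching the token, a request can be prepared with Python's standard library. The base URL `https://api.example.com`, the token value, and the helper name `build_request` are placeholders chosen for illustration; none of them comes from the API itself.

```python
import urllib.request

BASE_URL = "https://api.example.com"  # assumption: replace with the real host


def build_request(path: str, token: str, method: str = "GET") -> urllib.request.Request:
    """Prepare (but do not send) a request that carries the Bearer token."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        method=method,
        headers={"Authorization": f"Bearer {token}"},
    )


req = build_request("/documents", token="my-token")
# Send with urllib.request.urlopen(req) once BASE_URL points at a real server.
```

The same helper can be reused for the other endpoints by changing `path` and `method`.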
Endpoints
1. Get User Documents
GET /documents
Returns a list of all documents accessible to the current user based on their access roles.
Response:
{
  "documents": [
    {
      "id": 1,
      "file_name": "example.pdf",
      "document_path": "s3://bucket-role1/example.pdf",
      "mime_type": "application/pdf",
      "num_pages": 10,
      "created_at": "2024-01-15T10:30:00Z",
      "access_roles": ["ROLE1"]
    }
  ],
  "total_count": 1
}
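The response above could be consumed on the client side roughly as follows. This is a sketch that assumes each entry carries exactly the fields shown in the example; the `Document` class and `parse_documents` function are illustrative names, not part of the API.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Document:
    id: int
    file_name: str
    document_path: str
    mime_type: str
    num_pages: int
    created_at: str
    access_roles: List[str]


def parse_documents(body: str) -> List[Document]:
    """Turn the JSON body of GET /documents into Document records."""
    payload = json.loads(body)
    # Assumes every entry has exactly the fields of the example response.
    return [Document(**item) for item in payload["documents"]]


sample = """
{
  "documents": [
    {
      "id": 1,
      "file_name": "example.pdf",
      "document_path": "s3://bucket-role1/example.pdf",
      "mime_type": "application/pdf",
      "num_pages": 10,
      "created_at": "2024-01-15T10:30:00Z",
      "access_roles": ["ROLE1"]
    }
  ],
  "total_count": 1
}
"""
docs = parse_documents(sample)
print(docs[0].file_name, docs[0].access_roles)
```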
2. Get Document by ID
GET /documents/{document_id}
Downloads the actual document file from S3 by document ID.
Parameters:
- document_id (path): ID of the document to download
Response:
- Binary file content with appropriate headers for file download
- Returns 404 if document not found
- Returns 403 if user doesn't have access to the document
3. Upload Document
POST /documents
Uploads a new document to the S3 bucket associated with the specified access role.
Request Body (multipart/form-data):
- access_role (form field): Target access role for the document
- file (file): Document file to upload
Response:
{
  "message": "File uploaded successfully",
  "file_name": "example.pdf",
  "additional_info": {
    "bucket_name": "bucket-role1",
    "object_key": "example.pdf",
    "normalized_filename": "example.pdf",
    "size": 12345
  }
}
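To illustrate the multipart/form-data request body, the sketch below encodes the `access_role` and `file` parts by hand using only the standard library. The function name `encode_upload` and the sample file content are assumptions made for the example; a real client would more likely rely on an HTTP library's built-in multipart support.

```python
import uuid
from typing import Tuple


def encode_upload(access_role: str, file_name: str, content: bytes,
                  mime_type: str = "application/octet-stream") -> Tuple[bytes, str]:
    """Build a multipart/form-data body with the access_role and file parts.

    Returns the encoded body and the matching Content-Type header value.
    Note: file_name is inserted as-is, so it must not contain double quotes.
    """
    boundary = uuid.uuid4().hex
    lines = [
        f"--{boundary}".encode(),
        b'Content-Disposition: form-data; name="access_role"',
        b"",
        access_role.encode(),
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"'.encode(),
        f"Content-Type: {mime_type}".encode(),
        b"",
        content,
        f"--{boundary}--".encode(),
        b"",
    ]
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


body, content_type = encode_upload(
    "ROLE1", "example.pdf", b"%PDF-1.4 sample bytes", "application/pdf"
)
```

The resulting `body` would be sent to POST /documents together with the `Content-Type` value returned here and the `Authorization: Bearer ...` header.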
4. Update Document
PUT /documents/{document_id}
Updates an existing document by uploading a new version to S3.
Parameters:
- document_id (path): ID of the document to update
Request Body (multipart/form-data):
- access_role (form field): Access role context for the operation
- file (file): New document file content
Response:
{
  "message": "Document updated successfully",
  "document_id": "1",
  "file_name": "example.pdf",
  "additional_info": {
    "original_filename": "example_updated.pdf",
    "size": 13456
  }
}
5. Delete Document
DELETE /documents/{document_id}
Deletes a document from both S3 and the database.
Parameters:
- document_id (path): ID of the document to delete
Request Body:
{
  "access_role": "ROLE1"
}
Response:
{
  "message": "Document deleted successfully",
  "document_id": "1",
  "file_name": "example.pdf"
}
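Because this DELETE call carries a JSON body, it may help to see how such a request can be assembled. The sketch below uses a placeholder base URL and token, and it assumes the server expects a `Content-Type: application/json` header, which the text above does not state explicitly.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # assumption: replace with the real host


def build_delete_request(document_id: int, access_role: str,
                         token: str) -> urllib.request.Request:
    """Prepare (but do not send) DELETE /documents/{document_id}."""
    body = json.dumps({"access_role": access_role}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/documents/{document_id}",
        data=body,
        method="DELETE",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumed, not documented
        },
    )


delete_req = build_delete_request(1, "ROLE1", token="my-token")
```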
Access Control
- Users can only access documents that have at least one access role in common with their assigned roles
- For upload operations, users must have the specific access role they're uploading to
- For update/delete operations, users must have the specific access role and the document must be assigned to that role
- All operations validate user permissions before proceeding
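The rules above can be expressed as three small predicates. This is only a sketch of the stated logic with hypothetical function names; the server's actual implementation is not shown in this document.

```python
from typing import Iterable


def can_read(user_roles: Iterable[str], document_roles: Iterable[str]) -> bool:
    """Read access: the user shares at least one role with the document."""
    return bool(set(user_roles) & set(document_roles))


def can_upload(user_roles: Iterable[str], target_role: str) -> bool:
    """Upload: the user must hold the role they are uploading to."""
    return target_role in set(user_roles)


def can_modify(user_roles: Iterable[str], document_roles: Iterable[str],
               access_role: str) -> bool:
    """Update/delete: the user holds the role AND the document is assigned to it."""
    return access_role in set(user_roles) and access_role in set(document_roles)


user = ["ROLE1", "ROLE2"]
print(can_read(user, ["ROLE2"]))             # True: ROLE2 is shared
print(can_modify(user, ["ROLE2"], "ROLE1"))  # False: document is not in ROLE1
```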
File Processing
- The names of uploaded files are automatically normalized (special characters, umlauts, and spaces are converted to underscores)
- Files are stored in S3 buckets specific to each access role (format: {bucket_prefix}-{role})
- The document ingestion service will automatically process uploaded files for search indexing
- Updated documents will be reprocessed by the ingestion service when detected
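The exact normalization rules are not spelled out here, so the following is a rough approximation: it assumes that every character other than ASCII letters, digits, dots, hyphens, and underscores becomes an underscore. Lowercasing the role in the bucket name is likewise an assumption, inferred from the example path `s3://bucket-role1/example.pdf` for role `ROLE1`. Both function names are hypothetical.

```python
import re


def normalize_filename(name: str) -> str:
    """Replace every character outside [A-Za-z0-9._-] with an underscore.

    Assumed rule set; the service's real normalization may differ.
    """
    return re.sub(r"[^A-Za-z0-9._-]", "_", name)


def bucket_for_role(bucket_prefix: str, role: str) -> str:
    """Apply the {bucket_prefix}-{role} format (role lowercased by assumption)."""
    return f"{bucket_prefix}-{role.lower()}"


print(normalize_filename("Jahres übersicht.pdf"))  # Jahres__bersicht.pdf
print(bucket_for_role("bucket", "ROLE1"))          # bucket-role1
```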
Error Responses
- 400 Bad Request: Invalid request parameters or missing file
- 401 Unauthorized: Authentication required or invalid token
- 403 Forbidden: Access denied to specified role or document
- 404 Not Found: Document not found
- 500 Internal Server Error: Server-side processing error
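A client might translate these status codes into exceptions along the following lines. `DocumentApiError` and `check_status` are hypothetical names introduced for this sketch; the messages are taken from the list above.

```python
ERROR_MEANINGS = {
    400: "Invalid request parameters or missing file",
    401: "Authentication required or invalid token",
    403: "Access denied to specified role or document",
    404: "Document not found",
    500: "Server-side processing error",
}


class DocumentApiError(Exception):
    """Hypothetical client-side exception carrying the HTTP status code."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(f"HTTP {status}: {message}")
        self.status = status


def check_status(status: int) -> None:
    """Raise DocumentApiError for the documented error codes, else return."""
    if status in ERROR_MEANINGS:
        raise DocumentApiError(status, ERROR_MEANINGS[status])


try:
    check_status(403)
except DocumentApiError as err:
    print(err)  # HTTP 403: Access denied to specified role or document
```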