Binary Files with GridFS
Introduction
RESTHeart is a very good fit for headless CMS use cases, as it allows to effectively manage and aggregate content and its metadata, such as images, audios and videos, delivering them via a REST API.
RESTHeart offers complete binary files management. It’s possible to create, read and delete even huge files. RESTHeart makes use of MongoDB’s GridFS, a specification for storing and retrieving files that exceed the BSON-document size limit of 16 MB.
You can read more about GridFS here.
Note: RESTHeart’s HTTP clients must adhere to the multipart/form-data specification. However the specification allows a form to upload multiple files at the same time, but RESTHeart prohibits that: while it’s possible to POST / PUT several different non-binary parts for each file, it’s mandatory to upload one single binary file per each POST / PUT request. This limitation is necessary to univocally relate all the optional parts to the unique file.
In the examples in this section we assume RESTHeart is running on localhost
, port 8080
.
Create a File Bucket
File buckets are special collections whose aim is to store binary data and the associated metadata. Start by creating a new one for uploading binaries.
To be recognized as a file bucket the collection’s name must end with *.files postfix (i.e. mybucket.files is a valid file bucket name)
Create the default restheart
database, if none exists yet:
PUT / HTTP/1.1
HTTP/1.1 200 OK
Access-Control-Allow-Credentials: true
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: Location, ETag, X-Powered-By
Auth-Token: 1o6j8dt1f5y6jlu05t0blw2q4g280cgdv8253ilqhyoskoi5de
Auth-Token-Location: /tokens/admin
Auth-Token-Valid-Until: 2019-07-04T09:24:37.171231Z
Connection: keep-alive
Content-Length: 0
Content-Type: application/json
Date: Thu, 04 Jul 2019 09:09:37 GMT
ETag: 5d1dc2510951267987cf8ab1
X-Powered-By: restheart.org
Create the collection for hosting files.
PUT /mybucket.files HTTP/1.1
HTTP/1.1 201 Created
Access-Control-Allow-Credentials: true
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: Location, ETag, X-Powered-By
Auth-Token: 1o6j8dt1f5y6jlu05t0blw2q4g280cgdv8253ilqhyoskoi5de
Auth-Token-Location: /tokens/admin
Auth-Token-Valid-Until: 2019-07-04T09:26:41.654633Z
Connection: keep-alive
Content-Length: 0
Content-Type: application/json
Date: Thu, 04 Jul 2019 09:11:41 GMT
ETag: 5d1dc2cd0951267987cf8ab2
X-Powered-By: restheart.org
Uploading into file buckets with POST
To upload a binary file the request body must be encoded as multipart/form-data
.
A multipart/form-data encoded request is splitted and submitted into different sections. These are trivially called request parts. Each part is delimited by a random digits string boundary.
By setting Content-Type:multipart/form-data
, the following example request done using HTTPie will POST a binary image example.jpg
with {"author":"SoftInstigate"}
metadata associated into the previously created /mybucket.files
http -v --form POST localhost:8080/mybucket.files @/path/to/my/binary/example.jpg properties='{"author":"SoftInstigate"}'
will produce the following results:
POST /mybucket.files HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Content-Length: 66574
Content-Type: multipart/form-data; boundary=a54420e707704ef69099ae603c9acf4c
Host: localhost:8080
User-Agent: HTTPie/0.9.8
### Multipart flow example --- this will be done usually hidden and handled under-the-hood by your http client
--------------------------a54420e707704ef69099ae603c9acf4c
Content-Disposition: form-data; name="properties"
{"author":"SoftInstigate"}
--------------------------a54420e707704ef69099ae603c9acf4c
Content-Disposition: form-data; name="binary"; filename="example.jpg"
Content-Type: image/jpg
>>> contents of the file
--------------------------a54420e707704ef69099ae603c9acf4c--
HTTP/1.1 201 Created
Access-Control-Allow-Credentials: true
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: Location, ETag, X-Powered-By
Connection: keep-alive
Content-Length: 0
Content-Type: application/json
Date: Wed, 04 Mar 2020 15:48:40 GMT
Location: http://localhost:8080/mybucket.files/5e5fcdd8ddfa017301e00331
X-Powered-By: restheart.org
The Location
HTTP header returns the file’s location:
Location: http://localhost:8080/mybucket.files/5e5fcdd8ddfa017301e00331
Uploading with PUT
In order to explicit a human-readable name or to update an already exisiting asset you can use a PUT multipart/form-data
encoded request.
Note that the location contains the object ID automatically generated by
MongoDB (see the string 5e5fcdd8ddfa017301e00331
at the end of the
above Location URL). This is a unique identifier and it’s convenient in many
situation, but it’s not always desirable. In many case it would be
better to explicitly name the resource with something more readable and
meaningful.
PUT /mybucket.files/example HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Content-Length: 66574
Content-Type: multipart/form-data; boundary=12c5bba0f1014fca997aa0aeb9cd5094
Host: localhost:8080
User-Agent: HTTPie/0.9.8
HTTP/1.1 201 Created
Access-Control-Allow-Credentials: true
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: Location, ETag, X-Powered-By
Connection: keep-alive
Content-Length: 0
Content-Type: application/json
Date: Wed, 04 Mar 2020 16:25:57 GMT
X-Powered-By: restheart.org
In this case the resource has a much nicer URL:
http://localhost:8080/mybucket.files/example
Which is easier to read and link than the automatically generated name.
GET the file’s metadata
If you GET the Location
, RESTHeart actually returns the file’s metadata:
We have from the previous example:
GET http://localhost:8080/mybucket.files/5d1ef3d50951267987cf8ab4 HTTP/1.1
GET /mybucket.files/5e5fcdd8ddfa017301e00331 HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:8080
User-Agent: HTTPie/0.9.8
HTTP/1.1 200 OK
Access-Control-Allow-Credentials: true
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: Location, ETag, X-Powered-By
Connection: keep-alive
Content-Encoding: gzip
Content-Length: 244
Content-Type: application/json
Date: Wed, 04 Mar 2020 16:10:12 GMT
ETag: 5e5fcdd8ddfa017301e00330
X-Powered-By: restheart.org
{
"_id": {
"$oid": "5e5fcdd8ddfa017301e00331"
},
"_links": {
"rh:data": {
"href": "/mybucket.files/5e5fcdd8ddfa017301e00331/binary"
}
},
"chunkSize": 261120,
"filename": "file",
"length": 66273,
"md5": "6399742c9e8a662fd2e999d8d31653c0",
"metadata": {
"_etag": {
"$oid": "5e5fcdd8ddfa017301e00330"
},
"author": "SoftInstigate",
"contentType": "image/jpeg"
},
"uploadDate": {
"$date": 1583336920114
}
}
Uploaded files metadata can be filtered and treated as regular collection documents.
For example, http://localhost:8080/mybucket.files?filter={"metadata.author": "SoftInstigate"}
returns all the file metadata that satisfy the query.
GET the file’s binary data
The actual binary content is available by appending the binary
postfix, like this:
http://localhost:8080/mybucket.files/5e5fcdd8ddfa017301e00331/binary
The “properties” part
As we did in previus example, is possible to associate metadata to a file upload.
To add optional form data parts to the request it is necessary to embed the data into the properties
field name. The content of this field is automatically parsed as JSON data, so it must be valid JSON.
In the following example, we set the “author” and the “filename”:
http -v --form PUT localhost:8080/mybucket.files/example @/path/to/my/binary/example.jpg properties='{"author":"SoftInstigate", "filename":"example"}'
We then retrieve the metadata associated. The JSON will be merged into in the metadata section:
GET /mybucket.files/example HTTP/1.1
HTTP/1.1 200 OK
{
"_id": "example",
"_links": {
"rh:data": {
"href": "/mybucket.files/example/binary"
}
},
"chunkSize": 261120,
"filename": "example",
"length": 66273,
"md5": "6399742c9e8a662fd2e999d8d31653c0",
"metadata": {
"_etag": {
"$oid": "5e5fd9fcddfa017301e00335"
},
"author": "SoftInstigate",
"contentType": "image/jpeg",
"filename": "example"
},
"uploadDate": {
"$date": 1583340028302
}
}