Upload CSV files

Introduction

A comma-separated values (CSV) file is a delimited text file that uses a comma to separate values. A CSV file stores tabular data (numbers and text) in plain text. Each line of the file is a data record. Each record consists of one or more fields, separated by commas. The use of the comma as a field separator is the source of the name for this file format.

The CSV file format is supported by almost all spreadsheets and database management systems, including Microsoft Excel, Apple Numbers, LibreOffice Calc, and Apache OpenOffice Calc.

The CSV Uploader Service allows importing data from a CSV file into a MongoDB collection.

This RESTHeart service is bound to the /csv API resource by default.

By uploading a CSV file you create or update one document per each row of the file.

Before running the example requests

The following examples assume the RESTHeart Platform running on localhost with the default configuration: the database restheart is bound to / and the user admin exists with default password secret.

To create the restheart db, run the following:

PUT / HTTP/1.1

Let’s create a poi collection, run the following:

PUT /poi HTTP/1.1

Upload the CSV file

We are going to use the following example file POI.csv:

id,name,city,lat,lon,note
1,Coliseum,Rome,41.8902614,12.4930871,Also known as the Flavian Amphitheatre
2,Duomo,Milan,45.464278,9.190596,Milan Cathedral

To import the POI.csv into the collection poi, run the following:

POST /csv?db=restheart&coll=poi&id=0 HTTP/1.1
Content-Type: text/csv

id,name,city,lat,lon,note
1,Coliseum,Rome,41.8902614,12.4930871,Also known as the Flavian Amphitheatre
2,Duomo,Milan,45.464278,9.190596,Milan Cathedral

The /csv path is a reserved path, used by the RESTHeart CSV Uploader Service

Note that the Content-Type must be text/csv otherwise you’ll get a 400 Bad Request error.

Now the /poi collection contains the documents:

GET /poi HTTP/1.1
Response
[
    {
        "_id": 2,
        "city": "Milan",
        "lat": 45.464278,
        "lon": 9.190596,
        "name": "Duomo",
        "note": "Milan Cathedral",
        "_etag": {
            "$oid": "5d249beebb77e333b6dc9c84"
        }
    },
    {
        "_id": 1,
        "city": "Rome",
        "lat": 41.8902614,
        "lon": 12.4930871,
        "name": "Coliseum",
        "note": "Also known as the Flavian Amphitheatre",
        "_etag": {
            "$oid": "5d249beebb77e333b6dc9c83"
        }
    }
]

Query parameters

The CSV uploader service is controlled by the following query parameters.

query parameter description default value
db (required) the name of the database no default
coll (required) the name of the collection no default
id id column index no id column
sep column separator ,
transformer name of a transformer to apply to imported data no transformer
update if true, update matching documents false
upsert applies when update=true; if true, create new document if no documents match the _id true
props additional properties to add to each row, e.g. ?props=foo&props=bar no props
values values of additional properties to add to each row e.g. ?values=1&values=2 no values

If the id parameter is not specified, a document is created with a new ObjectId per each CSV row.

Update documents from CSV

If the CSV lines are changed or new ones are added, you can update your collection with the update and upsert parameters.

To update your collection use the update parameter.

New lines in the CSV will NOT be added.

POST /csv?db=restheart&coll=poi&id=0&update=true HTTP/1.1
Content-Type: text/csv

id,name,city,lat,lon,note
1,Coliseum,Rome,41.8902614,12.4930871,Also known as the Flavian Amphitheatre -UPDATED-
2,Duomo,Milan,45.464278,9.190596,Milan Cathedral -UPDATED-
3,Cattedrale di Santa Maria del Fiore,43.773251,11.255474,Florence Cathedral

To update existing documents and add new ones, add the upsert=true query parameter:

POST /csv?db=restheart&coll=poi&id=0&update=true&upsert=true HTTP/1.1
Content-Type: text/csv

id,name,city,lat,lon,note
1,Coliseum,Rome,41.8902614,12.4930871,Also known as the Flavian Amphitheatre
2,Duomo,Milan,45.464278,9.190596,Milan Cathedral
3,Cattedrale di Santa Maria del Fiore,43.773251,11.255474,Florence Cathedral

Apply a transformer

To apply a transformer use the transformer query parameter.

The CSV format allows creating flat documents. The transformer modifies the request body so that we can take advantage of the nested nature of JSON.

Request
POST /csv?db=restheart&coll=poi&id=0&update=true&transformer=GeoJSONTransformer HTTP/1.1
Content-Type: text/csv

id,name,city,lat,lon,note
1,Coliseum,Rome,41.8902614,12.4930871,Also known as the Flavian Amphitheatre
2,Duomo,Milan,45.464278,9.190596,Milan Cathedral

The GeoJSONTransformer is the name of a custom transformer that must me packaged with RESTHeart. It transforms the latitude and longitude coordinates into a GeoJson object.

Check Package RESTHeart Core plugins to know how to package the custom transformer.

@RegisterPlugin(name = "GeoJSONTransformer", description = "Transform the x,y coordinate in GeoJSON object ")
public class GeoJSONTransformer implements Transformer {

    @Override
    public void transform(final HttpServerExchange exchange, final RequestContext context, BsonValue contentToTransform, final BsonValue args) {
        var body = contentToTransform.asDocument();

        // get Coordinates
        var coordinates = new BsonArray();
        coordinates.add(body.get("lon"));
        coordinates.add(body.get("lat"));

        var point = new BsonDocument();

        point.put("type", new BsonString("Point"));
        point.put("coordinates", coordinates);

        // Add the object to the document
        body.append("point", point);
    }
}

Now the documents have the new property point with the GeoJSON Point object:

[
    {
        "point": {
            "coordinates": [
                9.190596,
                45.464278
            ],
            "type": "Point"
        }
        "_etag": {
            "$oid": "5d2c40021861f94794721285"
        },
        "_id": 2,
        "city": "Milan",
        "lat": 45.464278,
        "lon": 9.190596,
        "name": "Duomo",
        "note": "Milan Cathedral"
    },
    {
        "point": {
            "coordinates": [
                12.4930871,
                41.8902614
            ],
            "type": "Point"
        },
        "_etag": {
            "$oid": "5d2c40021861f94794721284"
        },
        "_id": 1,
        "city": "Rome",
        "lat": 41.8902614,
        "lon": 12.4930871,
        "name": "Coliseum",
        "note": "Also known as the Flavian Amphitheatre"
    }
]