layout: true class: animated fadeIn middle numbers .footnote[ `PSA` - N. Dubray - ENSIIE - 2024 - [:book:](../index.html) ] --- # MongoDB .hcenter.w40[] ## History * **10gen software** started the development in **2007** * Open source development model in **2009** (paid support and services) * **10gen software** renamed to **MongoDB Inc.** in **2013** * October 2017, publicly-traded company (`MDB`) ## In a nutshell * is a **NoSQL** database program (document database) * can use schemas **or not** * is **scalable** and **replicable** (with load-balancing) * can execute JavaScript **server-side** (e.g. `mapreduce`) * can do **transactions** (02/2018) * one of the main databases for **Big Data** .block[ * Repository: [https://github.com/mongodb/mongo](https://github.com/mongodb/mongo) * Website: [https://www.mongodb.com](https://www.mongodb.com) * License: `SSPL` (`AGPLv3` before October 2018) + `Apache` license (drivers) ] --- # MongoDB - Characteristics ## Built-in capabilities :+: Queries: field, range, regexp, JavaScript (server-side)... :+: Indexing: **optional**, primary and secondary keys :+: Replication: **automatic election** of a new master in case of failure :+: Load balancing: **sharding** uses a **shard key** to distribute data over `MongoDB` instances :+: File storage: `MongoDB` can act as a **load-balanced**, **replicated** file system (`GridFS`) :+: Aggregation: allows performing e.g. `mapreduce` operations :+: Capped collections: some collections can be capped and become **circular queues** when filled :+: Transactions: multi-document `ACID` since `v4.0` --- # MongoDB - Scalability :arrow_right: How to increase the performance of a given `DBMS` ? 
.row[ .column.w65[ ## Vertical scaling :arrow_right: **Add resources to existing servers** :+: nothing to change in the servers config :warning: may cause some downtime :warning: is limited by the hardware ] .column.w30[ .mermaid[ graph TD A(Master) --> B(Slave 0) A(Master) --> C(Slave 1) A(Master) --> D(Slave 2) style A fill:#2A2 style B fill:#2A2 style C fill:#2A2 style D fill:#2A2 ] ] ] .row[ .column.w55[ ## Horizontal scaling :arrow_right: **Add more servers** :+: nothing to change in the servers config :+: can be transparent :+: is not limited by the hardware ] .column.w40[ .mermaid[ graph TD A(Master) --> B(Slave 0) A(Master) --> C(Slave 1) A(Master) --> D(Slave 2) A(Master) -.-> E(Slave 3) style E fill:#2A2 ] ] ] ## Horizontal scaling with `SQL`-type `DBMS` :warning: is often done **manually** :warning: **may break transaction integrity** ## Horizontal scaling with `MongoDB` (**sharding**) :+: is a built-in mechanism :+: is almost transparent to the admin :+: can use different hardware --- # MongoDB - Schemas ? 
## `SQL`-type `DBMS` :warning: a schema has to be defined **before inserting data** :warning: **new data has to follow the existing schema** :warning: to change the schema, existing data must be **migrated** .vspace[] ## `MongoDB` :+: schemas can be used **or not** :+: if no schema is used, new data can have **any attributes** :+: a schema can be created, changed or removed at any time, **no migration needed** :arrow_right: a schema can be used for **updating** or **inserting** data --- # MongoDB - Overview ## Data hierarchy .mermaid[ graph LR A(MongoDB instance) --> B B(Database) --> C C(Collection) --> D D(Document) --> E(Field) ] ## Example .tree.hcenter[ `MongoDB` instance * Database: `test_db` * Collection: `MusicRecords` * Document: * Field `"_id"`: `ObjectId("5b05cfb3c6256b28f54bf25e")` * Field `"band"`: `"Pink Floyd"` * Field `"song"`: `"Wish You Were Here"` * Document: * Field `"_id"`: `ObjectId("5b05cfb3c6256b28f54bf25f")` * Field `"band"`: `"Pink Floyd"` * Field `"song"`: `"Have a Cigar"` * Field `"year"`: `1975` * Database: `other_db` ] --- # MongoDB - Installation ## Ubuntu 17.10 ```shell *$ apt update Hit:1 http://fr.archive.ubuntu.com/ubuntu artful InRelease Hit:2 http://fr.archive.ubuntu.com/ubuntu artful-updates InRelease Hit:3 http://security.ubuntu.com/ubuntu artful-security InRelease Hit:4 http://fr.archive.ubuntu.com/ubuntu artful-backports InRelease Reading package lists... Done Building dependency tree Reading state information... Done All packages are up to date. *$ apt install mongodb-server Reading package lists... Done Building dependency tree Reading state information... Done The following additional packages will be installed: libboost-chrono1.62.0 mongo-tools mongodb-clients The following NEW packages will be installed: libboost-chrono1.62.0 mongo-tools mongodb-clients mongodb-server 0 upgraded, 4 newly installed, 0 to remove and 3 not upgraded. Need to get 47.8 MB of archives. 
After this operation, 207 MB of additional disk space will be used. Do you want to continue? [Y/n] Get:1 http://fr.archive.ubuntu.com/ubuntu artful/main amd64 libboost-chrono1.62.0 amd64 1.62.0+dfsg-4build3 [11.8 kB] Get:2 http://fr.archive.ubuntu.com/ubuntu artful/universe amd64 mongo-tools amd64 3.2.11-1 [9,993 kB] Get:3 http://fr.archive.ubuntu.com/ubuntu artful/universe amd64 mongodb-clients amd64 1:3.4.7-1 [18.7 MB] Get:4 http://fr.archive.ubuntu.com/ubuntu artful/universe amd64 mongodb-server amd64 1:3.4.7-1 [19.2 MB] Fetched 47.8 MB in 20s (2,293 kB/s) [...] Setting up libboost-chrono1.62.0:amd64 (1.62.0+dfsg-4build3) ... Setting up mongo-tools (3.2.11-1) ... Setting up mongodb-clients (1:3.4.7-1) ... Setting up mongodb-server (1:3.4.7-1) ... Processing triggers for libc-bin (2.26-0ubuntu2.1) ... ``` --- # MongoDB - Installation ## Ubuntu >= 20.10 ```shell *$ wget -qO - https://www.mongodb.org/static/pgp/server-4.4.asc | sudo apt-key add - Warning: apt-key is deprecated. Manage keyring files in trusted.gpg.d instead (see apt-key(8)). OK *$ echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/4.4 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.4.list deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/4.4 multiverse *$ sudo apt update [...] Get:18 https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/4.4 Release.gpg [801 B] Get:21 https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/4.4/multiverse amd64 Packages [9 029 B] Get:22 https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/4.4/multiverse arm64 Packages [6 951 B] [...] *$ sudo apt install -y mongodb-org [...] ``` Follow [installation instructions](https://docs.mongodb.com/manual/tutorial/install-mongodb-on-ubuntu/#install-community-ubuntu-pkg). --- # MongoDB - Configuration file .row[ .column.w48[ :arrow_right: **Old format** ## .hcenter[\[/etc/mongodb.conf\]] ```ini # mongodb.conf # Where to store the data. 
*dbpath=/home/mongodb #where to log *logpath=/home/mongodb/mongodb.log logappend=true bind_ip = 127.0.0.1 #port = 27017 # Enable journaling, http://www.mongodb.org/display/DOCS/Journaling *journal=false # Enables periodic logging of CPU utilization and I/O wait #cpu = true # Turn on/off security. Off is currently the default #noauth = true #auth = true [...] ``` ] .column.w48[ :arrow_right: **New format** ## .hcenter[\[/etc/mongodb.conf\]] ```ini storage: * dbPath: "/home/mongodb" journal: * enabled: false systemLog: destination: file * path: "/home/mongodb/mongodb.log" logAppend: true net: bindIp: 127.0.0.1 port: 27017 * maxIncomingConnections: 20000 security: authorization: disabled ``` ] ] --- # MongoDB - Start a server instance :arrow_right: With `Systemd`, use `systemctl start unit.service`. ## .hcenter[\[Shell session\]] ```shell *$ systemctl start mongodb.service *$ systemctl status mongodb.service ● mongodb.service - An object/document-oriented database Loaded: loaded (/lib/systemd/system/mongodb.service; enabled; vendor preset: enabled) Active: active (running) since Tue 2018-05-22 21:12:24 CEST; 55s ago Docs: man:mongod(1) Main PID: 748 (mongod) Tasks: 16 (limit: 4915) CGroup: /system.slice/mongodb.service └─748 /usr/bin/mongod --unixSocketPrefix=/run/mongodb --config /etc/mongodb.conf May 22 21:12:24 dell-e5450 systemd[1]: Started An object/document-oriented database. *$ journalctl -u mongodb May 22 21:12:24 dell-e5450 systemd[1]: Started An object/document-oriented database. 
``` --- # MongoDB - Insert some documents :arrow_right: To use a database: `use dbName` :arrow_right: To create a collection: `db.createCollection('collectionName')` :arrow_right: To remove a collection: `db.collectionName.drop()` :arrow_right: To insert a document: `db.collectionName.insertOne(document)` ## .hcenter[\[Shell session\]] ```shell *$ mongo mongodb://127.0.0.1 MongoDB shell version v3.4.7 connecting to: mongodb://127.0.0.1 MongoDB server version: 3.4.7 > use test_db switched to db test_db > db.createCollection('MusicRecords') { "ok": 1 } > db.createCollection('Movies') { "ok": 1 } > db.Movies.drop() true > db.MusicRecords.insertOne({band:'Pink Floyd', song:'Wish You Were Here'}) { "acknowledged" : true, "insertedId" : ObjectId("5b05cfb3c6256b28f54bf25e") } > db.MusicRecords.insertOne({band:'Pink Floyd', song:'Have a Cigar', year: 1975}) { "acknowledged" : true, "insertedId" : ObjectId("5b05cfb3c6256b28f54bf25f") } ``` --- # MongoDB - Retrieve some documents :arrow_right: To retrieve one document: `db.collectionName.findOne()` :arrow_right: To retrieve all documents: `db.collectionName.find()` :arrow_right: To retrieve all documents matching a query: `db.collectionName.find(query)` ## .hcenter[\[Shell session\]] ```shell *$ mongo mongodb://127.0.0.1 MongoDB shell version v3.4.7 connecting to: mongodb://127.0.0.1 MongoDB server version: 3.4.7 > use test_db switched to db test_db > db.MusicRecords.findOne() { "_id" : ObjectId("5b05cfb3c6256b28f54bf25e"), "band" : "Pink Floyd", "song" : "Wish You Were Here" } > db.MusicRecords.find() { "_id" : ObjectId("5b05cfb3c6256b28f54bf25e"), "band" : "Pink Floyd", "song" : "Wish You Were Here" } { "_id" : ObjectId("5b05cfb3c6256b28f54bf25f"), "band" : "Pink Floyd", "song" : "Have a Cigar", "year" : 1975 } > db.MusicRecords.find({year: 1975}) { "_id" : ObjectId("5b05cfb3c6256b28f54bf25f"), "band" : "Pink Floyd", "song" : "Have a Cigar", "year" : 1975 } ``` --- # MongoDB - Users ## Database user :arrow_right: Several users can be defined **for each 
database** :warning: Different users with the same name can exist for different databases :arrow_right: A user has to authenticate **against a given database** .vspace[] ## User's roles :arrow_right: The actions a user is permitted to perform are defined in **`roles`** :warning: Some roles can allow modifications **to other databases** :arrow_right: Several **built-in roles** exist, but **custom roles** can be constructed .vspace[] ## Privileges :arrow_right: A role is a set of **privileges** :arrow_right: Example of privileges for the `read` built-in role in database `test_db`: * `collStats`, `dbHash`, `dbStats`, `find`, `killCursors`, `listCollections`, `listIndexes`, `planCacheRead` for `test_db.*` collections. --- # MongoDB - Built-in roles .row[ .column.w48[ ## Database User Roles * `read` * `readWrite` ## Database Administration Roles * `dbAdmin` * `dbOwner` * `userAdmin` ## Cluster Administration Roles * `clusterAdmin` * `clusterManager` * `clusterMonitor` * `hostManager` ] .column.w48[ ## Backup and Restoration Roles * `backup` * `restore` ## :warning: All-Database Roles * `readAnyDatabase` * `readWriteAnyDatabase` * `userAdminAnyDatabase` * `dbAdminAnyDatabase` ## :warning: Superuser Roles * `root` ## Internal Role * `__system`: do not use :warning: ] ] --- # MongoDB - Admin user :warning: **Disable authorization** in the configuration file first ! :arrow_right: Create a `root` user with `db.createUser()`. ## .hcenter[\[Shell session\]] ```shell *$ mongo > use admin switched to db admin > db.createUser({user:"toto", pwd: "toto123", roles: ["root"]}) Successfully added user: { "user" : "toto", "roles" : [ "root" ] } ``` --- # MongoDB - Database user :arrow_right: Create a user with `readWrite` built-in role. 
## .hcenter[\[Shell session\]] ```shell *$ mongo > use test_db switched to db test_db > db.createUser({user:"user0", pwd: "pwd0", roles: ["readWrite"]}) Successfully added user: { "user" : "user0", "roles" : [ "readWrite" ] } ``` --- # MongoDB - Configuration file with authorization .row[ .column.w48[ :arrow_right: **Old format** ## .hcenter[\[/etc/mongodb.conf\]] ```ini # mongodb.conf # Where to store the data. dbpath=/home/mongodb #where to log logpath=/home/mongodb/mongodb.log logappend=true bind_ip = 127.0.0.1 #port = 27017 # Enable journaling, http://www.mongodb.org/display/DOCS/Journaling journal=false # Enables periodic logging of CPU utilization and I/O wait #cpu = true # Turn on/off security. Off is currently the default #noauth = true *auth = true [...] ``` ] .column.w48[ :arrow_right: **New format** ## .hcenter[\[/etc/mongodb.conf\]] ```ini storage: dbPath: "/home/mongodb" journal: enabled: false systemLog: destination: file path: "/home/mongodb/mongodb.log" logAppend: true net: bindIp: 127.0.0.1 port: 27017 maxIncomingConnections: 20000 security: * authorization: enabled ``` ] ] --- # MongoDB - Restart a server instance :arrow_right: With `Systemd`, use `systemctl restart unit.service`. ## .hcenter[\[Shell session\]] ```shell *$ systemctl restart mongodb.service *$ systemctl status mongodb.service ● mongodb.service - An object/document-oriented database Loaded: loaded (/lib/systemd/system/mongodb.service; enabled; vendor preset: enabled) Active: active (running) since Tue 2018-05-22 23:21:40 CEST; 2s ago Docs: man:mongod(1) Main PID: 5868 (mongod) Tasks: 16 (limit: 4915) CGroup: /system.slice/mongodb.service └─5868 /usr/bin/mongod --unixSocketPrefix=/run/mongodb --config /etc/mongodb.conf May 22 23:21:40 dell-e5450 systemd[1]: Started An object/document-oriented database. *$ journalctl -u mongodb May 22 21:12:24 dell-e5450 systemd[1]: Started An object/document-oriented database. 
May 22 23:21:40 dell-e5450 systemd[1]: Stopping An object/document-oriented database... May 22 23:21:40 dell-e5450 systemd[1]: Stopped An object/document-oriented database. May 22 23:21:40 dell-e5450 systemd[1]: Started An object/document-oriented database. ``` --- # MongoDB - Login as admin ## Using CLI options :arrow_right: Syntax: `mongo mongodb://user:passwd@host/db` ## .hcenter[\[Shell session\]] ```shell *$ mongo mongodb://toto:toto123@127.0.0.1/admin MongoDB shell version v3.4.7 connecting to: mongodb://toto:toto123@127.0.0.1/admin MongoDB server version: 3.4.7 > ``` ## Using `mongo` commands :arrow_right: Command: `db.auth({user: 'user', pwd: 'pwd'})` ## .hcenter[\[Shell session\]] ```shell *$ mongo mongodb://127.0.0.1 MongoDB shell version v3.4.7 connecting to: mongodb://127.0.0.1 MongoDB server version: 3.4.7 > use admin switched to db admin > db.auth({user:'toto', pwd:'WRONGPASSWORD'}) Error: Authentication failed. 0 > db.auth({user:'toto', pwd:'toto123'}) 1 > ``` --- # MongoDB - Login as user ## Using CLI options :arrow_right: Syntax: `mongo mongodb://user:passwd@host/db` ## .hcenter[\[Shell session\]] ```shell *$ mongo mongodb://user0:pwd0@127.0.0.1/test_db MongoDB shell version v3.4.7 connecting to: mongodb://user0:pwd0@127.0.0.1/test_db MongoDB server version: 3.4.7 > ``` ## Using `mongo` commands :arrow_right: Command: `db.auth({user: 'user', pwd: 'pwd'})` ## .hcenter[\[Shell session\]] ```shell *$ mongo mongodb://127.0.0.1 MongoDB shell version v3.4.7 connecting to: mongodb://127.0.0.1 MongoDB server version: 3.4.7 > use test_db switched to db test_db > db.auth({user:'user0', pwd:'WRONGPASSWORD'}) Error: Authentication failed. 0 > db.auth({user:'user0', pwd:'pwd0'}) 1 > ``` --- # MongoDB - Multiple authentications :arrow_right: It is possible to authenticate as **different users at the same time**. :arrow_right: Display the connection status with `db.runCommand({connectionStatus: 1})`. 
## .hcenter[\[Shell session\]] ```shell *$ mongo mongodb://toto:toto123@127.0.0.1/admin MongoDB shell version v3.4.7 connecting to: mongodb://toto:toto123@127.0.0.1/admin MongoDB server version: 3.4.7 > db.runCommand({connectionStatus: 1}) { "authInfo" : { "authenticatedUsers" : [ { * "user" : "toto", "db" : "admin" } ], "authenticatedUserRoles" : [ { "role" : "root", "db" : "admin" } ] }, "ok" : 1 } ``` --- # MongoDB - Multiple authentications ## .hcenter[\[Shell session\]] ```shell *$ mongo mongodb://toto:toto123@127.0.0.1/admin MongoDB shell version v3.4.7 connecting to: mongodb://toto:toto123@127.0.0.1/admin MongoDB server version: 3.4.7 > use test_db switched to db test_db > db.auth({user:'user0', pwd:'pwd0'}) 1 > db.runCommand({connectionStatus: 1}) { "authInfo" : { "authenticatedUsers" : [ { * "user" : "toto", "db" : "admin" }, { * "user" : "user0", "db" : "test_db" } ], "authenticatedUserRoles" : [ { "role" : "root", "db" : "admin" }, { "role" : "read", "db" : "test_db" } ] }, "ok" : 1 } ``` --- # MongoDB - List users :arrow_right: Use `show users` to list users of a given database. ## .hcenter[\[Shell session\]] ```shell *$ mongo mongodb://user0:pwd0@127.0.0.1/test_db MongoDB shell version v3.4.7 connecting to: mongodb://user0:pwd0@127.0.0.1/test_db MongoDB server version: 3.4.7 > show users 2018-05-22T23:44:56.226+0200 E QUERY [thread1] Error: not authorized on test_db to execute command { usersInfo: 1.0 } : _getErrorWithCode@src/mongo/shell/utils.js:25:13 DB.prototype.getUsers@src/mongo/shell/db.js:1539:1 shellHelper.show@src/mongo/shell/utils.js:752:9 shellHelper@src/mongo/shell/utils.js:659:15 @(shellhelp2):1:1 ``` :warning: Not authorized ! --- # MongoDB - List users :arrow_right: Recreate user `user0` with `dbOwner` role. :arrow_right: Use command `db.dropUser('username')` to remove a user. 
## .hcenter[\[Shell session\]] ```shell *$ mongo mongodb://toto:toto123@127.0.0.1/admin MongoDB shell version v3.4.7 connecting to: mongodb://toto:toto123@127.0.0.1/admin MongoDB server version: 3.4.7 > use test_db switched to db test_db > db.dropUser('user0') true > db.createUser({user:'user0', pwd:'pwd0', roles:["dbOwner"]}) Successfully added user: { "user" : "user0", "roles" : [ "dbOwner" ] } > exit bye *$ mongo mongodb://user0:pwd0@127.0.0.1/test_db MongoDB shell version v3.4.7 connecting to: mongodb://user0:pwd0@127.0.0.1/test_db MongoDB server version: 3.4.7 *> show users { "_id" : "test_db.user0", "user" : "user0", "db" : "test_db", "roles" : [ { "role" : "dbOwner", "db" : "test_db" } ] } ``` --- # MongoDB - List databases / collections :arrow_right: Use command `show databases` to list databases. :arrow_right: Use command `show collections` to list collections of the current database. ## .hcenter[\[Shell session\]] ```shell *$ mongo mongodb://toto:toto123@127.0.0.1/admin MongoDB shell version v3.4.7 connecting to: mongodb://toto:toto123@127.0.0.1/admin MongoDB server version: 3.4.7 *> show databases admin 0.000GB local 0.000GB test_db 0.000GB ``` ## .hcenter[\[Shell session\]] ```shell *$ mongo mongodb://user0:pwd0@127.0.0.1/test_db MongoDB shell version v3.4.7 connecting to: mongodb://user0:pwd0@127.0.0.1/test_db MongoDB server version: 3.4.7 *> show collections MusicRecords ``` --- # MongoDB - View user's roles ## .hcenter[\[Shell session\]] ```shell *$ mongo mongodb://user0:pwd0@127.0.0.1/test_db MongoDB shell version v3.4.7 connecting to: mongodb://user0:pwd0@127.0.0.1/test_db MongoDB server version: 3.4.7 *> show roles { "role" : "dbAdmin", "db" : "test_db", "isBuiltin" : true, "roles" : [ ], "inheritedRoles" : [ ] } { "role" : "dbOwner", "db" : "test_db", "isBuiltin" : true, "roles" : [ ], "inheritedRoles" : [ ] } [...] 
{ "role" : "readWrite", "db" : "test_db", "isBuiltin" : true, "roles" : [ ], "inheritedRoles" : [ ] } { "role" : "userAdmin", "db" : "test_db", "isBuiltin" : true, "roles" : [ ], "inheritedRoles" : [ ] } ``` --- # MongoDB - View user's roles (other way) ## .hcenter[\[Shell session\]] ```shell *$ mongo mongodb://toto:toto123@127.0.0.1/admin MongoDB shell version v3.4.7 connecting to: mongodb://user0:pwd0@127.0.0.1/test_db MongoDB server version: 3.4.7 > use test_db switched to db test_db > db.auth({user:'user0',pwd:'pwd0'}) 1 > db.getRoles() [ ] *> db.getRoles({showBuiltinRoles: true}) [ { "role" : "__system", "db" : "admin", "isBuiltin" : true, "roles" : [ ], "inheritedRoles" : [ ] }, [...] ] > use test_db switched to db test_db *> db.getRoles({showBuiltinRoles: true}) [ { "role" : "dbAdmin", "db" : "test_db", "isBuiltin" : true, "roles" : [ ], "inheritedRoles" : [ ] }, [...] ] ``` --- # MongoDB - View user's privileges :arrow_right: Use command `db.getRole('read', {showPrivileges: true})` to get privileges. ## .hcenter[\[Shell session\]] ```shell *$ mongo mongodb://user0:pwd0@127.0.0.1/test_db MongoDB shell version v3.4.7 connecting to: mongodb://user0:pwd0@127.0.0.1/test_db MongoDB server version: 3.4.7 > db.getRole('read', {showPrivileges: true}) { "role" : "read", "db" : "test_db", "isBuiltin" : true, "roles" : [ ], "inheritedRoles" : [ ], "privileges" : [ [...] { "resource" : { "db" : "test_db", "collection" : "system.indexes" }, "actions" : [ "collStats", "dbHash", "dbStats", "find", "killCursors", "listCollections", "listIndexes", "planCacheRead" ] }, [...] ``` --- # MongoDB - Change user's password :arrow_right: Use command `db.changeUserPassword` to change a password. 
## .hcenter[\[Shell session\]] ```shell *$ mongo mongodb://toto:toto123@127.0.0.1/admin MongoDB shell version v3.4.7 connecting to: mongodb://user0:pwd0@127.0.0.1/test_db MongoDB server version: 3.4.7 > use test_db switched to db test_db *> db.changeUserPassword("user0", "plop") > db.auth({user:'user0', pwd:'pwd0'}) Error: Authentication failed. 0 > db.auth({user:'user0', pwd:'plop'}) 1 ``` --- # MongoDB - `GridFS` :arrow_right: Use a MongoDB database as a file system. ## .hcenter[\[Shell session\]] ```shell $ # put a file $ mongofiles --verbose=-1 -u user0 -p pwd0 -d test_db put test_numpy.py added file: test_numpy.py $ # list files (and sizes) $ mongofiles --verbose=-1 -u user0 -p pwd0 -d test_db list test_numpy.py 976 $ # get a file $ mongofiles --verbose=-1 -u user0 -p pwd0 -d test_db get test_numpy.py finished writing to test_numpy.py $ # delete a file $ mongofiles --verbose=-1 -u user0 -p pwd0 -d test_db delete test_numpy.py successfully deleted all instances of 'test_numpy.py' from GridFS ``` --- # MongoDB - Logs :arrow_right: The path of the log file is specified in the `systemLog` section of the config file. 
## .hcenter[\[/etc/mongodb.conf\]] ```json storage: dbPath: "/home/mongodb" journal: enabled: false systemLog: destination: file * path: "/home/mongodb/mongodb.log" logAppend: true net: bindIp: 127.0.0.1 port: 27017 maxIncomingConnections: 20000 security: authorization: enabled ``` --- # MongoDB - Logs ## .hcenter[\[/home/mongodb/mongodb.log\]] .nowrap[ ```accesslog 2018-05-21T00:02:48.089+0200 I CONTROL [initandlisten] MongoDB starting : pid=3882 port=27017 dbpath=/home/mongodb 64-bit host=dell-e5450 2018-05-21T00:02:48.090+0200 I CONTROL [initandlisten] db version v3.4.7 2018-05-21T00:02:48.090+0200 I CONTROL [initandlisten] git version: cf38c1b8a0a8dca4a11737581beafef4fe120bcd 2018-05-21T00:02:48.090+0200 I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.0.2g 1 Mar 2016 2018-05-21T00:02:48.090+0200 I CONTROL [initandlisten] allocator: tcmalloc 2018-05-21T00:02:48.090+0200 I CONTROL [initandlisten] modules: none 2018-05-21T00:02:48.090+0200 I CONTROL [initandlisten] build environment: 2018-05-21T00:02:48.090+0200 I CONTROL [initandlisten] distarch: x86_64 2018-05-21T00:02:48.090+0200 I CONTROL [initandlisten] target_arch: x86_64 2018-05-21T00:02:48.090+0200 I CONTROL [initandlisten] options: { config: "/etc/mongodb.conf", net: { bindIp: "127.0.0.1", unixDomainSocket: { pathPrefix: "/run/mongodb" } }, storage: { dbPath: "/home/mongodb", journal: { enabled: true } }, systemLog: { destination: "file", logAppend: true, path: "/home/mongodb/mongodb.log" } } [...] 
2018-05-23T00:15:20.591+0200 I NETWORK [thread1] connection accepted from 127.0.0.1:35158 #21 (1 connection now open) 2018-05-23T00:15:20.591+0200 I NETWORK [conn21] received client metadata from 127.0.0.1:35158 conn21: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.7" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "17.10" } } 2018-05-23T00:15:20.603+0200 I ACCESS [conn21] Successfully authenticated as principal toto on admin 2018-05-23T00:15:20.604+0200 I ACCESS [conn21] Unauthorized: not authorized on admin to execute command { getLog: "startupWarnings" } 2018-05-23T00:15:20.606+0200 I ACCESS [conn21] Unauthorized: not authorized on admin to execute command { replSetGetStatus: 1.0, forShell: 1.0 } 2018-05-23T00:15:27.230+0200 I ACCESS [conn21] Unauthorized: not authorized on test_db to execute command { listCollections: 1.0, filter: {} } 2018-05-23T00:16:14.463+0200 I ACCESS [conn21] SCRAM-SHA-1 authentication failed for user0 on test_db from client 127.0.0.1:35158 ; AuthenticationFailed: SCRAM-SHA-1 authentication failed, storedKey mismatch 2018-05-23T00:16:26.888+0200 I ACCESS [conn21] Successfully authenticated as principal user0 on test_db 2018-05-23T00:16:26.889+0200 I ACCESS [conn21] Unauthorized: not authorized on admin to execute command { replSetGetStatus: 1.0, forShell: 1.0 } 2018-05-23T00:18:23.326+0200 I - [conn21] end connection 127.0.0.1:35158 (1 connection now open) ``` ] --- # MongoDB - `GUI` ? :arrow_right: Several MongoDB GUIs exist. * [`robo3t`](https://github.com/Studio3T/robomongo) (formerly `Robomongo`): `GPLv3`, enhanced shell. * [`studio3t`](https://studio3t.com/) (formerly `MongoChef`): **free for non-commercial use**, full GUI. * [`adminMongo`](https://adminmongo.markmoffat.com/): `MIT` license, web interface. * many others (non-free, Windows or macOS only, etc.) 
--- # MongoDB - `adminMongo` ## .hcenter[\[Shell session\]] ```shell $ git clone https://github.com/mrvautin/adminMongo.git && cd adminMongo [...] $ npm install [...] $ npm start > admin-mongo@0.0.23 start /home/dubrayn/temp/adminMongo > node app.js adminMongo listening on host: http://0.0.0.0:1234 ``` :arrow_right: `adminMongo` is [active](http://0.0.0.0:1234). --- # MongoDB - `adminMongo` .hcenter.shadow.w100[] --- # MongoDB - `adminMongo` .hcenter.shadow.w100[] --- # MongoDB - `adminMongo` .hcenter.shadow.w100[] --- # MongoDB - `adminMongo` .hcenter.shadow.w100[] --- class: top .vspace[] .vspace[] # MongoDB - Conclusions .hcenter.w40[] :arrow_right: We have just seen a **tiny part** of what MongoDB is capable of. :bulb: Browse more doc/examples/tutorials at [https://www.mongodb.com](https://www.mongodb.com). ## Interactive use There are two ways to **interactively** use a MongoDB database: * connect with the mongo shell CLI tool, or * use a GUI/webGUI. ## Non-interactive use How to let a **`Python` program** use a MongoDB database ? :arrow_right: Connect to the server listening socket, write mongo shell commands and parse results -- ? :warning: **NO** :warning: :arrow_right: Use MongoDB `Python` bindings ? :v: **YES** :v: :bulb: Can even work with interactive use from `Python`... 
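---

# MongoDB - Connection URI and special characters

:arrow_right: The login slides use the `mongodb://user:passwd@host/db` syntax. If a user name or a password contains reserved characters such as `@`, `:` or `/`, they have to be percent-encoded before being placed in the URI.

:bulb: A minimal sketch, using only the `Python` standard library. The helper name `build_mongo_uri` and the password `p@ss:w/rd` are hypothetical, chosen for illustration only.

```python
#!/usr/bin/env python3
from urllib.parse import quote_plus


def build_mongo_uri(username, password, host, dbname):
    """Return a mongodb:// URI with percent-encoded credentials.

    build_mongo_uri is a hypothetical helper, not part of any MongoDB tool.
    """
    return 'mongodb://%s:%s@%s/%s' % (
        quote_plus(username),  # reserved characters become %XX sequences
        quote_plus(password),
        host,
        dbname,
    )


if __name__ == '__main__':
    # Plain credentials are left untouched.
    print(build_mongo_uri('user0', 'pwd0', '127.0.0.1', 'test_db'))
    # Reserved characters are escaped so that the URI stays parseable.
    print(build_mongo_uri('user0', 'p@ss:w/rd', '127.0.0.1', 'test_db'))
```

:arrow_right: The resulting string can be passed as-is to a client expecting a `mongodb://` URI.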
--- # `PyMongo` ## In a nutshell :arrow_right: Full-featured MongoDB `Python` bindings :+: The recommended way to use MongoDB from `Python` :+: Compatible with `CPython` `2.6`, `2.7`, `3.4+`, `PyPy`, and `PyPy3` .vspace[] .block[ * Repository: [https://github.com/mongodb/mongo-python-driver](https://github.com/mongodb/mongo-python-driver) * Website: [http://api.mongodb.com/python/current/](http://api.mongodb.com/python/current/) * License: `Apache` license ] --- # `PyMongo` - Hello world ## .hcenter[\[mongodb/hello_world.py\]] ```Python #!/usr/bin/env python3 from pymongo import MongoClient import pymongo.errors username, password, host, dbname = 'admin0', 'pwd0', '127.0.0.1', 'admin' client = MongoClient('mongodb://%s:%s@%s/%s' % (username, password, host, dbname)) try: print(client.list_database_names()) # list databases db = client['test_db'] print(db.list_collection_names()) # list collections except pymongo.errors.OperationFailure as e: print("ERROR: %s" % (e)) ``` ## .hcenter[\[Shell session\]] ```shell $ mongodb/hello_world.py ['admin', 'config', 'local', 'test_db'] ['MusicRecords'] ``` --- # `PyMongo` - Insert document ## .hcenter[\[mongodb/insert_document.py\]] ```Python #!/usr/bin/env python3 from pymongo import MongoClient import pymongo.errors username, password, host, dbname = 'user0', 'pwd0', '127.0.0.1', 'test_db' client = MongoClient('mongodb://%s:%s@%s/%s' % (username, password, host, dbname)) try: db = client.test_db pycollection = db['pycollection'] for i in range(3): data = { "run": 0, "time": 0.01 * i, "norm": 1.0} data_id = pycollection.insert_one(data).inserted_id print("id: %s" % (str(data_id))) except pymongo.errors.OperationFailure as e: print("ERROR: %s" % (e)) ``` ## .hcenter[\[Shell session\]] ```shell $ mongodb/insert_document.py id: 6063439e1c836443066c6ff4 id: 6063439e1c836443066c6ff5 id: 6063439e1c836443066c6ff6 ``` --- # `PyMongo` - Retrieve documents ## .hcenter[\[mongodb/retrieve_documents.py\]] ```Python #!/usr/bin/env python3 
from pymongo import MongoClient import pymongo.errors username, password, host, dbname = 'user0', 'pwd0', '127.0.0.1', 'test_db' client = MongoClient('mongodb://%s:%s@%s/%s' % (username, password, host, dbname)) try: db = client.test_db pycollection = db.pycollection print("### find_one()") print(pycollection.find_one()) print("### find()") for document in pycollection.find(): print(document) print("### find({'time': 0.01})") for document in pycollection.find({'time': 0.01}): print(document) except pymongo.errors.OperationFailure as e: print("ERROR: %s" % (e)) ``` ## .hcenter[\[Shell session\]] ```shell $ mongodb/retrieve_documents.py ### find_one() {'_id': ObjectId('6063439e1c836443066c6ff4'), 'run': 0, 'time': 0.0, 'norm': 1.0} ### find() {'_id': ObjectId('6063439e1c836443066c6ff4'), 'run': 0, 'time': 0.0, 'norm': 1.0} {'_id': ObjectId('6063439e1c836443066c6ff5'), 'run': 0, 'time': 0.01, 'norm': 1.0} {'_id': ObjectId('6063439e1c836443066c6ff6'), 'run': 0, 'time': 0.02, 'norm': 1.0} ### find({'time': 0.01}) {'_id': ObjectId('6063439e1c836443066c6ff5'), 'run': 0, 'time': 0.01, 'norm': 1.0} ``` --- # `PyMongo` - Search document by `_id` :arrow_right: Construct an `ObjectId` instance from the `_id` string value. 
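
:bulb: Why the conversion is needed: `_id` values are stored as 12-byte `ObjectId` objects, so a query using the bare 24-character hexadecimal string would normally match nothing. As a rough sketch using only the standard library (the helper `objectid_timestamp` is hypothetical; with `PyMongo`, `ObjectId.is_valid()` and `ObjectId.generation_time` provide the same information), the string can be validated and its leading 4-byte creation timestamp decoded:

```python
#!/usr/bin/env python3
import re
from datetime import datetime, timezone

# An ObjectId is 12 bytes, usually written as 24 hexadecimal characters.
OBJECTID_RE = re.compile(r'[0-9a-fA-F]{24}')


def objectid_timestamp(hex_id):
    """Return the creation time (UTC) encoded in an ObjectId hex string.

    objectid_timestamp is a hypothetical helper written for this sketch.
    """
    if not OBJECTID_RE.fullmatch(hex_id):
        raise ValueError('not a 24-character hexadecimal string: %r' % (hex_id,))
    # The first 4 bytes (8 hex characters) are seconds since the Unix epoch.
    seconds = int(hex_id[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


if __name__ == '__main__':
    print(objectid_timestamp('6063439e1c836443066c6ff5'))
```
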
## .hcenter[\[mongodb/retrieve_by_id.py\]] ```Python #!/usr/bin/env python3 from pymongo import MongoClient import pymongo.errors from bson.objectid import ObjectId username, password, host, dbname = 'user0', 'pwd0', '127.0.0.1', 'test_db' client = MongoClient('mongodb://%s:%s@%s/%s' % (username, password, host, dbname)) try: db = client.test_db pycollection = db['pycollection'] myId = ObjectId('6063439e1c836443066c6ff5') print(pycollection.find_one({'_id': myId})) except pymongo.errors.OperationFailure as e: print("ERROR: %s" % (e)) ``` ## .hcenter[\[Shell session\]] ```shell $ mongodb/retrieve_by_id.py {'_id': ObjectId('6063439e1c836443066c6ff5'), 'run': 0, 'time': 0.01, 'norm': 1.0} ``` --- # `PyMongo` - Insert multiple documents ## .hcenter[\[mongodb/insert_documents.py\]] ```Python #!/usr/bin/env python3 from pymongo import MongoClient import pymongo.errors username, password, host, dbname = 'user0', 'pwd0', '127.0.0.1', 'test_db' client = MongoClient('mongodb://%s:%s@%s/%s' % (username, password, host, dbname)) try: db = client.test_db pycollection = db['pycollection'] data = [{"pipo": i} for i in range(3)] pycollection.insert_many(data) for d in pycollection.find(): print(str(d)) for i in range(3): pycollection.delete_one({"pipo": i}) print("after...") for d in pycollection.find(): print(str(d)) except pymongo.errors.OperationFailure as e: print("ERROR: %s" % (e)) ``` ## .hcenter[\[Shell session\]] ```shell $ mongodb/insert_documents.py {'_id': ObjectId('6063439e1c836443066c6ff4'), 'run': 0, 'time': 0.0, 'norm': 1.0} {'_id': ObjectId('6063439e1c836443066c6ff5'), 'run': 0, 'time': 0.01, 'norm': 1.0} {'_id': ObjectId('6063439e1c836443066c6ff6'), 'run': 0, 'time': 0.02, 'norm': 1.0} {'_id': ObjectId('606345904c49d6d63af42fa3'), 'pipo': 0} {'_id': ObjectId('606345904c49d6d63af42fa4'), 'pipo': 1} {'_id': ObjectId('606345904c49d6d63af42fa5'), 'pipo': 2} after... 
{'_id': ObjectId('6063439e1c836443066c6ff4'), 'run': 0, 'time': 0.0, 'norm': 1.0}
{'_id': ObjectId('6063439e1c836443066c6ff5'), 'run': 0, 'time': 0.01, 'norm': 1.0}
{'_id': ObjectId('6063439e1c836443066c6ff6'), 'run': 0, 'time': 0.02, 'norm': 1.0}
```

---

# `PyMongo` - Replace a document

## .hcenter[\[mongodb/replace_document.py\]]

```Python
#!/usr/bin/env python3
from pymongo import MongoClient
import pymongo.errors
from bson.objectid import ObjectId

username, password, host, dbname = 'user0', 'pwd0', '127.0.0.1', 'test_db'
client = MongoClient('mongodb://%s:%s@%s/%s' % (username, password, host, dbname))

try:
    db = client.test_db
    pycollection = db['pycollection']

    pycollection.replace_one({'time': 0.01}, {'toto': 'tutu'})
    for d in pycollection.find():
        print(str(d))

    pycollection.replace_one({'toto': 'tutu'}, {'run': 0, 'time': 0.01, 'norm': 1.0})
    print('after...')
    for d in pycollection.find():
        print(str(d))
except pymongo.errors.OperationFailure as e:
    print("ERROR: %s" % (e))
```

## .hcenter[\[Shell session\]]

```shell
$ mongodb/replace_document.py
{'_id': ObjectId('6063439e1c836443066c6ff4'), 'run': 0, 'time': 0.0, 'norm': 1.0}
{'_id': ObjectId('6063439e1c836443066c6ff5'), 'toto': 'tutu'}
{'_id': ObjectId('6063439e1c836443066c6ff6'), 'run': 0, 'time': 0.02, 'norm': 1.0}
after...
{'_id': ObjectId('6063439e1c836443066c6ff4'), 'run': 0, 'time': 0.0, 'norm': 1.0}
{'_id': ObjectId('6063439e1c836443066c6ff5'), 'run': 0, 'time': 0.01, 'norm': 1.0}
{'_id': ObjectId('6063439e1c836443066c6ff6'), 'run': 0, 'time': 0.02, 'norm': 1.0}
```

---

# `PyMongo` - Update a document

## .hcenter[\[mongodb/update_document.py\]]

```Python
#!/usr/bin/env python3
from pymongo import MongoClient
import pymongo.errors
from bson.objectid import ObjectId

username, password, host, dbname = 'user0', 'pwd0', '127.0.0.1', 'test_db'
client = MongoClient('mongodb://%s:%s@%s/%s' % (username, password, host, dbname))

try:
    db = client.test_db
    pycollection = db['pycollection']

    pycollection.update_one({'time': 0.01}, {'$inc': {'run': 3}})
    for d in pycollection.find():
        print(str(d))

    pycollection.update_one({'time': 0.01}, {'$inc': {'run': -3}})
    print('after...')
    for d in pycollection.find():
        print(str(d))
except pymongo.errors.OperationFailure as e:
    print("ERROR: %s" % (e))
```

## .hcenter[\[Shell session\]]

```shell
$ mongodb/update_document.py
{'_id': ObjectId('6063439e1c836443066c6ff4'), 'run': 0, 'time': 0.0, 'norm': 1.0}
{'_id': ObjectId('6063439e1c836443066c6ff5'), 'run': 3, 'time': 0.01, 'norm': 1.0}
{'_id': ObjectId('6063439e1c836443066c6ff6'), 'run': 0, 'time': 0.02, 'norm': 1.0}
after...
{'_id': ObjectId('6063439e1c836443066c6ff4'), 'run': 0, 'time': 0.0, 'norm': 1.0}
{'_id': ObjectId('6063439e1c836443066c6ff5'), 'run': 0, 'time': 0.01, 'norm': 1.0}
{'_id': ObjectId('6063439e1c836443066c6ff6'), 'run': 0, 'time': 0.02, 'norm': 1.0}
```

---

# `PyMongo` - Count

:arrow_right: Get the number of documents matching a criterion.

## .hcenter[\[mongodb/count.py\]]

```Python
#!/usr/bin/env python3
from pymongo import MongoClient
import pymongo.errors
from bson.objectid import ObjectId

username, password, host, dbname = 'user0', 'pwd0', '127.0.0.1', 'test_db'
client = MongoClient('mongodb://%s:%s@%s/%s' % (username, password, host, dbname))

try:
    db = client.test_db
    pycollection = db['pycollection']
    print(pycollection.count_documents({'time': 0.01}))
except pymongo.errors.OperationFailure as e:
    print("ERROR: %s" % (e))
```

## .hcenter[\[Shell session\]]

```shell
$ mongodb/count.py
1
```

---

# `PyMongo` - Range queries

:arrow_right: Use [query selectors](https://docs.mongodb.com/manual/reference/operator/query/) to build **advanced queries**.

:bulb: Use `sort()` to sort results.

## .hcenter[\[mongodb/range_queries.py\]]

```Python
#!/usr/bin/env python3
from pymongo import MongoClient
import pymongo.errors

username, password, host, dbname = 'user0', 'pwd0', '127.0.0.1', 'test_db'
client = MongoClient('mongodb://%s:%s@%s/%s' % (username, password, host, dbname))

try:
    db = client.test_db
    pycollection = db.pycollection
    for document in pycollection.find({"time": {"$lt": 0.015}}).sort('time'):
        print(document)
except pymongo.errors.OperationFailure as e:
    print("ERROR: %s" % (e))
```

## .hcenter[\[Shell session\]]

```shell
$ mongodb/range_queries.py
{'_id': ObjectId('6063439e1c836443066c6ff4'), 'run': 0, 'time': 0.0, 'norm': 1.0}
{'_id': ObjectId('6063439e1c836443066c6ff5'), 'run': 0, 'time': 0.01, 'norm': 1.0}
```

---

# `PyMongo` - Indexing

:arrow_right: Use `create_index()` to create an index on a key of a collection.

## .hcenter[\[mongodb/create_index.py\]]

```Python
#!/usr/bin/env python3
from pymongo import MongoClient
import pymongo.errors

username, password, host, dbname = 'user0', 'pwd0', '127.0.0.1', 'test_db'
client = MongoClient('mongodb://%s:%s@%s/%s' % (username, password, host, dbname))

try:
    db = client.test_db
    pycollection = db.pycollection
    pycollection.create_index('time', unique = True)
    pycollection.insert_one({'time': 0.01})
except pymongo.errors.OperationFailure as e:
    print("ERROR: %s" % (e))
```

## .hcenter[\[Shell session\]]

```shell
$ mongodb/create_index.py
ERROR: E11000 duplicate key error collection: test_db.pycollection index: time_1 dup key: { time: 0.01 }, full error: {'index': 0, 'code': 11000, 'keyPattern': {'time': 1}, 'keyValue': {'time': 0.01}, 'errmsg': 'E11000 duplicate key error collection: test_db.pycollection index: time_1 dup key: { time: 0.01 }'}
```

---

# `PyMongo` - Indexing

:arrow_right: Use `drop_index()` to remove an existing index for a collection.

:bulb: Use `list_indexes()` to get a list of indexes.

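:bulb: The index name `time_1` reported in the duplicate-key error is not arbitrary: when `create_index()` is called without an explicit `name`, the index is named after its keys, each field followed by its direction (`1` ascending, `-1` descending). The sketch below reproduces that convention in plain Python, assuming the default naming is in effect. `default_index_name` is a hypothetical helper, not a `PyMongo` function; the real name is also returned by `create_index()`.

```python
#!/usr/bin/env python3
# Sketch only: reproduce the default index naming convention.
def default_index_name(keys):
    """Hypothetical helper: build '<field>_<direction>' parts joined by '_'."""
    return "_".join("%s_%s" % (field, direction) for field, direction in keys)

print(default_index_name([('time', 1)]))
print(default_index_name([('run', 1), ('time', -1)]))
```
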
## .hcenter[\[mongodb/remove_index.py\]]

```Python
#!/usr/bin/env python3
from pymongo import MongoClient
import pymongo.errors

username, password, host, dbname = 'user0', 'pwd0', '127.0.0.1', 'test_db'
client = MongoClient('mongodb://%s:%s@%s/%s' % (username, password, host, dbname))

try:
    db = client.test_db
    pycollection = db.pycollection
    pycollection.drop_index('time_1')
    for i in pycollection.list_indexes():
        print(i)
except pymongo.errors.OperationFailure as e:
    print("ERROR: %s" % (e))
```

## .hcenter[\[Shell session\]]

```shell
$ mongodb/remove_index.py
SON([('v', 2), ('key', SON([('_id', 1)])), ('name', '_id_')])
```

---

# `PyMongo` - Numpy arrays

## .hcenter[\[mongodb/numpy_to_mongodb.py\]]

```Python
#!/usr/bin/env python3
from pymongo import MongoClient
import pymongo.errors
import numpy as np
import bson
import pickle

username, password, host, dbname = 'user0', 'pwd0', '127.0.0.1', 'test_db'
client = MongoClient('mongodb://%s:%s@%s/%s' % (username, password, host, dbname))

try:
    for i in range(3):
        mat = np.eye(3) * i
        # serialize the array and wrap it as BSON binary data
        bindat = bson.binary.Binary(pickle.dumps(mat, protocol = 2))
        data_id = client['test_db']['numpy_test'].insert_one({'mat': bindat}).inserted_id
        print("id: %s" % (str(data_id)))
        print(mat)
except pymongo.errors.OperationFailure as e:
    print("ERROR: %s" % (e))
```

## .hcenter[\[Shell session\]]

```shell
$ mongodb/numpy_to_mongodb.py
id: 606347c23629cec2ce762e84
[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
id: 606347c23629cec2ce762e85
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
id: 606347c23629cec2ce762e86
[[2. 0. 0.]
 [0. 2. 0.]
 [0. 0. 2.]]
```

---

# `PyMongo` - Numpy arrays

## .hcenter[\[mongodb/mongodb_to_numpy.py\]]

```Python
#!/usr/bin/env python3
from pymongo import MongoClient
import pymongo.errors
import numpy as np
import bson
import pickle

username, password, host, dbname = 'user0', 'pwd0', '127.0.0.1', 'test_db'
client = MongoClient('mongodb://%s:%s@%s/%s' % (username, password, host, dbname))

try:
    for data in client['test_db']['numpy_test'].find():
        bindat = data["mat"]
        data_id = data["_id"]
        # rebuild the array from the stored binary data
        mat = pickle.loads(bindat)
        print("id: %s" % (str(data_id)))
        print(mat)
except pymongo.errors.OperationFailure as e:
    print("ERROR: %s" % (e))
```

## .hcenter[\[Shell session\]]

```shell
$ mongodb/mongodb_to_numpy.py
id: 606347c23629cec2ce762e84
[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
id: 606347c23629cec2ce762e85
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
id: 606347c23629cec2ce762e86
[[2. 0. 0.]
 [0. 2. 0.]
 [0. 0. 2.]]
```

---

# `PyMongo` - `GridFS`

## .hcenter[\[mongodb/gridfs_example.py\]]

```Python
#!/usr/bin/env python3
from pymongo import MongoClient
import pymongo.errors
from bson.objectid import ObjectId
import hashlib
import gridfs

def hash_content(content):
    m = hashlib.md5()
    m.update(content)
    return m.hexdigest()

username, password, host, dbname = 'admin0', 'pwd0', '127.0.0.1', 'admin'
client = MongoClient('mongodb://%s:%s@%s/%s' % (username, password, host, dbname))

try:
    db = client['test_gridfs']
    # use a distinct name so the `gridfs` module is not shadowed
    fs = gridfs.GridFS(db)

    filename = 'toto'
    content = '12f18481bh192d0 08 d h2j011jp'.encode('utf-8')
    print(hash_content(content))

    # save content as a GridFS file
    fs.put(content, filename = filename)

    # load content from a GridFS file
    fp = fs.get_last_version(filename = filename)
    print(hash_content(fp.read()))
except pymongo.errors.OperationFailure as e:
    print("ERROR: %s" % (e))
```

---

# `PyMongo` - `GridFS`

## .hcenter[\[Shell session\]]

```shell
$ mongodb/gridfs_example.py
4d5e1510e879cf18a2b05e89529a847b
4d5e1510e879cf18a2b05e89529a847b
```

:arrow_right: Use the CLI program `mongofiles` to list, get,
put, delete... `GridFS` files.

## .hcenter[\[Shell session\]]

```shell
$ # list files
$ mongofiles --verbose=-1 -h 127.0.0.1 -u admin0 -p pwd0 --authenticationDatabase=admin -d test_gridfs list
toto 29
$ # get file
$ mongofiles --verbose=-1 -h 127.0.0.1 -u admin0 -p pwd0 --authenticationDatabase=admin -d test_gridfs get toto
$ # compute MD5
$ md5sum toto
4d5e1510e879cf18a2b05e89529a847b toto
$ # delete file
$ mongofiles --verbose=-1 -h 127.0.0.1 -u admin0 -p pwd0 --authenticationDatabase=admin -d test_gridfs delete toto
```
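
:bulb: Behind `put()` and `get`, `GridFS` keeps file metadata in one collection (`fs.files` by default) and the content in another (`fs.chunks`), cut into chunks of 255 KiB by default. The sketch below mimics that splitting in plain Python, without any database, on the 29-byte content used above. `split_into_chunks` is a hypothetical helper, and the file name stands in for the `ObjectId` that a real `files_id` field would hold.

```python
#!/usr/bin/env python3
# Sketch only: mimic how a file is cut into chunk documents, without any database.
def split_into_chunks(files_id, content, chunk_size=255 * 1024):
    """Hypothetical helper: return one dict per chunk, ordered by index 'n'."""
    return [
        {'files_id': files_id, 'n': n, 'data': content[start:start + chunk_size]}
        for n, start in enumerate(range(0, len(content), chunk_size))
    ]

content = '12f18481bh192d0 08 d h2j011jp'.encode('utf-8')
chunks = split_into_chunks('toto', content, chunk_size=10)  # tiny size for the demo
for chunk in chunks:
    print(chunk['n'], chunk['data'])

# Reading the file back amounts to concatenating the chunks in order of 'n'.
restored = b''.join(c['data'] for c in sorted(chunks, key=lambda c: c['n']))
print(restored == content, len(restored))
```

With the default chunk size, a file this small fits in a single chunk; the tiny `chunk_size` is only there to make the splitting visible.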