I am having the following problem: I am running Metabase in a Docker container on macOS, and my database (MongoDB) is on the host. I am using the following docker-compose.yml configuration: version: '3' services: metabase: conta.
Metabase backend on MongoDB. To install CPAN::Testers::Metabase::MongoDB, copy and paste the appropriate command into your terminal.
Metabase index on MongoDB. To install Metabase::Index::MongoDB, copy and paste the appropriate command into your terminal.
Starting with MongoDB 4.4, mongoexport is released separately from the MongoDB Server and uses its own versioning, with an initial version of 100.0.0. Previously, mongoexport was released alongside the MongoDB Server and used matching versioning. For documentation on the MongoDB 4.2 or earlier versions of mongoexport, reference the MongoDB Server Documentation for.
Aggregation operations process data records and return computed results. Aggregation operations group values from multiple documents together, and can perform a variety of operations on the grouped data to return a single result. MongoDB provides three ways to perform aggregation: the aggregation pipeline, the map-reduce function, and single purpose aggregation methods.
Aggregation Pipeline
MongoDB's aggregation framework is modeled on the concept of data processing pipelines. Documents enter a multi-stage pipeline that transforms the documents into an aggregated result. For example:
First Stage: The $match stage filters the documents by the status field and passes to the next stage those documents that have status equal to 'A'.
Second Stage: The $group stage groups the documents by the cust_id field to calculate the sum of the amount for each unique cust_id.
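The two stages above can be sketched in Python. The `pipeline` list shows how the stages might be written for a MongoDB driver, and `run_match_group` emulates them in plain Python so the result can be checked without a database. The sample documents, the `total` output field, and the function name are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sample documents; the field names follow the text above.
orders = [
    {"cust_id": "A123", "amount": 500, "status": "A"},
    {"cust_id": "A123", "amount": 250, "status": "A"},
    {"cust_id": "B212", "amount": 200, "status": "A"},
    {"cust_id": "A123", "amount": 300, "status": "D"},
]

# The pipeline as it might be passed to a driver's aggregate() call.
pipeline = [
    {"$match": {"status": "A"}},
    {"$group": {"_id": "$cust_id", "total": {"$sum": "$amount"}}},
]


def run_match_group(docs):
    """Emulate the two stages in plain Python (no database needed)."""
    # Stage 1 ($match): keep only documents whose status is "A".
    matched = [d for d in docs if d["status"] == "A"]
    # Stage 2 ($group): sum the amount for each unique cust_id.
    totals = defaultdict(int)
    for d in matched:
        totals[d["cust_id"]] += d["amount"]
    return [{"_id": cid, "total": total} for cid, total in sorted(totals.items())]


results = run_match_group(orders)
```

The document with status 'D' is dropped by the first stage, so it does not contribute to the total for its customer.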
The most basic pipeline stages provide filters that operate like queries and document transformations that modify the form of the output document.
Other pipeline operations provide tools for grouping and sorting documents by specific field or fields as well as tools for aggregating the contents of arrays, including arrays of documents. In addition, pipeline stages can use operators for tasks such as calculating the average or concatenating a string.
The pipeline provides efficient data aggregation using native operations within MongoDB, and is the preferred method for data aggregation in MongoDB.
The aggregation pipeline can operate on a sharded collection.
The aggregation pipeline can use indexes to improve its performance during some of its stages. In addition, the aggregation pipeline has an internal optimization phase. See Pipeline Operators and Indexes and Aggregation Pipeline Optimization for details.
Single Purpose Aggregation Operations
MongoDB also provides db.collection.estimatedDocumentCount(), db.collection.count() and db.collection.distinct().
All of these operations aggregate documents from a single collection. While these operations provide simple access to common aggregation processes, they lack the flexibility and capabilities of an aggregation pipeline.
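As a rough illustration of what these single purpose operations compute, the sketch below reproduces a count and a distinct over an in-memory list of documents. The helper names, the equality-only filter, and the first-seen ordering are assumptions of this sketch, not the behavior of the MongoDB methods themselves.

```python
# Hypothetical documents standing in for one collection.
docs = [
    {"item": "pen", "status": "A"},
    {"item": "ink", "status": "A"},
    {"item": "pad", "status": "D"},
]


def count(documents, query=None):
    """Count documents matching a simple equality filter (empty filter matches all)."""
    query = query or {}
    return sum(all(d.get(k) == v for k, v in query.items()) for d in documents)


def distinct(documents, field):
    """Return the unique values of one field, in first-seen order."""
    seen = []
    for d in documents:
        if field in d and d[field] not in seen:
            seen.append(d[field])
    return seen
```

Each helper answers exactly one question about one collection, which is the trade-off the paragraph above describes: simple to call, but not composable the way pipeline stages are.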
Map-Reduce
An aggregation pipeline provides better performance and usability than a map-reduce operation.
Map-reduce operations can be rewritten using aggregation pipeline operators, such as $group, $merge, and others.
For map-reduce operations that require custom functionality, MongoDB provides the $accumulator and $function aggregation operators starting in version 4.4. Use these operators to define custom aggregation expressions in JavaScript.
For examples of aggregation pipeline alternatives to map-reduce operations, see Map-Reduce to Aggregation Pipeline and Map-Reduce Examples.
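To make the rewrite concrete, here is a minimal sketch that emulates a map-reduce job in plain Python and places the equivalent $group stage next to it. The `map_reduce` helper and the sample orders are hypothetical, and the emulation only approximates how a real map-reduce job runs.

```python
from collections import defaultdict

# Hypothetical sample documents.
orders = [
    {"cust_id": "A123", "amount": 500},
    {"cust_id": "A123", "amount": 250},
    {"cust_id": "B212", "amount": 200},
]


def map_reduce(docs, map_fn, reduce_fn):
    """Tiny emulation: map emits (key, value) pairs, reduce folds each key's values."""
    emitted = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            emitted[key].append(value)
    return {key: reduce_fn(key, values) for key, values in emitted.items()}


# Map-reduce formulation: emit (cust_id, amount), then sum the amounts per key.
mr_totals = map_reduce(
    orders,
    map_fn=lambda doc: [(doc["cust_id"], doc["amount"])],
    reduce_fn=lambda key, values: sum(values),
)

# The same computation expressed as a single aggregation pipeline stage.
group_stage = {"$group": {"_id": "$cust_id", "value": {"$sum": "$amount"}}}
```

The map and reduce functions collapse into one declarative stage, which is why simple map-reduce jobs are usually straightforward to port.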
Additional Features and Behaviors
For a feature comparison of the aggregation pipeline, map-reduce, and the special group functionality, see Aggregation Commands Comparison.
This article covers:
- Connecting to MongoDB.
- Configuring SSL via the command line.
- Connecting to a MongoDB Atlas cluster.
- General connectivity concerns.
How does MongoDB work in Metabase
Because MongoDB contains unstructured data, Metabase takes a different approach to syncing your database’s metadata. To get a sense of the schema, Metabase will scan the first 200 documents of each collection in your MongoDB. This sampling helps Metabase do things like differentiate datetime fields from string fields, and provide people with pre-populated filters. Metabase only scans a sample of the documents because scanning every document in every collection on every sync would put too much strain on your database. And while the sampling does a pretty good job of keeping Metabase up to date, it can also mean that new fields sometimes fall through the cracks, leading to visualization issues, or even fields failing to appear in your results. For more info, check out our troubleshooting guide.
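The effect of sampling can be illustrated with a small sketch. This is not Metabase's implementation; `infer_fields` is a hypothetical helper that records field names and Python type names from the first 200 documents, showing how a field that first appears later is missed.

```python
def infer_fields(documents, sample_size=200):
    """Collect field names and Python type names from the first `sample_size` documents."""
    schema = {}
    for doc in documents[:sample_size]:
        for field, value in doc.items():
            schema.setdefault(field, set()).add(type(value).__name__)
    return schema


# 200 documents with two fields, then one later document that adds a new field.
collection = [{"name": f"user{i}", "age": i} for i in range(200)]
collection.append({"name": "late", "age": 1, "signup_date": "2021-05-01"})

sampled = infer_fields(collection)                            # first 200 only
full = infer_fields(collection, sample_size=len(collection))  # every document
```

Here `sampled` never sees `signup_date`, while `full` does, at the cost of reading the whole collection.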
Connecting to MongoDB
Go to Admin -> Databases, and click the Add database button. Select MongoDB from the dropdown.
There are two ways to connect to MongoDB:
- Using the Metabase fields to input your connection details.
- Pasting your connection string.
Using Metabase fields
The default way to connect to MongoDB is to fill out your connection details in the fields Metabase provides:
- Host
- Database name
- Port
- Username
- Password
- Authentication Database (optional database to use when authenticating)
- Additional Mongo connection string options
Additional settings:
Use DNS SRV when connecting. Using this option requires that the provided host is an FQDN (fully qualified domain name). If connecting to an Atlas cluster, you might need to enable this option. If you don’t know what this means, leave this disabled.
Use a secure connection (SSL)? Enable SSL and paste the contents of the server’s SSL certificate chain in the input text box. This option is available for this method of connection only (i.e. you cannot include a certificate when connecting with a connection string).
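The fields above correspond closely to the parts of a standard MongoDB connection string. The sketch below assembles one from form-style values. The function and parameter names are hypothetical, and this illustrates the URI format rather than how Metabase builds its connection internally.

```python
from urllib.parse import quote_plus


def build_mongo_uri(host, database, port=27017, username=None, password=None,
                    auth_database=None, options="", use_srv=False):
    """Assemble a MongoDB connection string from form-style fields."""
    scheme = "mongodb+srv" if use_srv else "mongodb"
    credentials = ""
    if username:
        # Percent-encode credentials so characters like '@' or ':' stay unambiguous.
        credentials = quote_plus(username)
        if password:
            credentials += ":" + quote_plus(password)
        credentials += "@"
    # SRV connection strings take a bare hostname; the port is discovered via DNS.
    netloc = host if use_srv else f"{host}:{port}"
    params = []
    if auth_database:
        params.append(f"authSource={auth_database}")
    if options:
        params.append(options.lstrip("?&"))
    query = "?" + "&".join(params) if params else ""
    return f"{scheme}://{credentials}{netloc}/{database}{query}"
```

For example, `build_mongo_uri("localhost", "sales")` yields `mongodb://localhost:27017/sales`, and passing `use_srv=True` switches the scheme to `mongodb+srv` and drops the port.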
Using a connection string
If you’d prefer to connect to MongoDB using a connection string, click on Paste a connection string. The Metabase user interface will update with a field to paste your connection string.
Metabase currently does NOT support the following connection string parameters:
tlsCertificateKeyFile
tlsCertificateKeyFilePassword
tlsCAFile
If you need to use a certificate, connect via the default method and enable Use a secure connection (SSL).
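Before pasting a connection string, it can help to check it for the three unsupported parameters. The following is a minimal sketch. `unsupported_params` is a hypothetical helper, and the example host and file path in the usage note are placeholders.

```python
from urllib.parse import parse_qsl

# The three parameters listed above as unsupported, in lower case.
UNSUPPORTED = {"tlscertificatekeyfile", "tlscertificatekeyfilepassword", "tlscafile"}


def unsupported_params(connection_string):
    """Return the unsupported option names found in a connection string's query part."""
    _, _, query = connection_string.partition("?")
    names = [name for name, _ in parse_qsl(query, keep_blank_values=True)]
    # Connection string option names are case-insensitive, so compare in lower case.
    return [name for name in names if name.lower() in UNSUPPORTED]
```

Calling it on `mongodb://db.example.com:27017/sales?tls=true&tlsCAFile=/path/ca.pem` returns `["tlsCAFile"]`, which signals that the field-based method with SSL enabled is the better fit.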
Settings common to both connection options
- Use an SSH tunnel for database connections. Some database installations can only be accessed by connecting through an SSH bastion host. This option also provides an extra layer of security when a VPN is not available. Enabling this is usually slower than a direct connection.
- Automatically run queries when doing simple filtering and summarizing. When this is on, Metabase will automatically run queries when users do simple explorations with the Summarize and Filter buttons when viewing a table or chart. You can turn this off if querying this database is slow. This setting doesn’t affect drill-throughs or SQL queries.
- This is a large database, so let me choose when Metabase syncs and scans. By default, Metabase does a lightweight hourly sync and an intensive daily scan of field values. If you have a large database, we recommend turning this on and reviewing when and how often the field value scans happen.
Configuring SSL via the command line
You can enter a self-signed certificate via the Metabase UI (though not when using a connection string), or you can use the command line to add a self-signed certificate to a Java trust store.
Then, start Metabase using that trust store.
Learn more about configuring SSL with MongoDB.
Connecting to a MongoDB Atlas cluster
To make sure you are using the correct connection configuration:
Log into your Atlas cluster
Select the cluster you want to connect to, and click Connect.
Click Connect Your Application.
Select Java and 3.6 or later.
The resulting connection string has the relevant information to provide to Metabase’s Add a Database form for MongoDB. You will likely want to select the option Use DNS SRV, which newer Atlas clusters use by default.
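As a sketch of how the connection string maps onto the form, the helper below splits a string into the values the fields ask for. The function name, the dictionary keys, and the cluster hostname and credentials in the example are hypothetical placeholders.

```python
from urllib.parse import urlsplit, unquote_plus


def atlas_uri_to_fields(uri):
    """Split a connection string into the values the form fields ask for."""
    parts = urlsplit(uri)
    return {
        "host": parts.hostname,
        "database": parts.path.lstrip("/") or None,
        "username": unquote_plus(parts.username) if parts.username else None,
        "password": unquote_plus(parts.password) if parts.password else None,
        # A mongodb+srv:// scheme means the Use DNS SRV option should be enabled.
        "use_dns_srv": parts.scheme == "mongodb+srv",
        "additional_options": parts.query,
    }


fields = atlas_uri_to_fields(
    "mongodb+srv://app:s3cret@cluster0.example.mongodb.net/sales?retryWrites=true&w=majority"
)
```

Note that a `mongodb+srv` string carries no port, which is consistent with the hostname being resolved through DNS SRV records.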
General connectivity concerns
- Connect using DNS SRV, which is the recommended method for newer Atlas clusters.
- Have you checked your cluster host whitelist? If a test connection fails, try setting the IP whitelist to 0.0.0.0/0. Whitelisting this address allows connections from any IP address. If you know the IP address(es) or CIDR block of clients, use that instead.
- Connect to the secondary server. When connecting to a cluster, always use the ?readPreference=secondary argument in the connection string, which allows Metabase to read from a secondary server instead of consuming resources from the primary server.
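The readPreference advice can be applied mechanically. The sketch below appends the argument to a connection string unless a read preference is already present. `with_secondary_reads` is a hypothetical helper, and it assumes the string already contains the "/" that precedes any options.

```python
def with_secondary_reads(connection_string):
    """Append readPreference=secondary unless a readPreference is already set."""
    base, _, query = connection_string.partition("?")
    options = [opt for opt in query.split("&") if opt]
    if any(opt.lower().startswith("readpreference=") for opt in options):
        return connection_string  # respect an explicit choice
    options.append("readPreference=secondary")
    return base + "?" + "&".join(options)
```

For instance, `mongodb://h1,h2/sales?w=majority` becomes `mongodb://h1,h2/sales?w=majority&readPreference=secondary`, while a string that already sets `readPreference=primary` is returned unchanged.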