Path: blob/main/examples/getting-started.ipynb
Getting Started
Install the SingleStore package
The SingleStore package can be installed as follows:
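A minimal install sketch, assuming the package is published on PyPI as singlestoredb:

```python
# Inside a notebook cell; from a shell, use `pip install singlestoredb` instead.
%pip install singlestoredb
```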
In addition, you can install the SQLAlchemy and Ibis plugins with the following:
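A sketch of installing the plugins; the package names sqlalchemy-singlestoredb and ibis-singlestoredb are assumptions, so check PyPI for the current names:

```python
# Plugin package names are assumptions.
%pip install sqlalchemy-singlestoredb ibis-singlestoredb
```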
Import SingleStore
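The examples below assume the package is imported under a short alias:

```python
import singlestoredb as s2
```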
Create a connection
This function is compliant with the Python DB-API 2.0. In addition, it allows you to use a URL as a connection string rather than individual parameters. Parameters may also be set using environment variables (e.g., SINGLESTOREDB_HOST, SINGLESTOREDB_USER, SINGLESTOREDB_PASSWORD, etc.), but it is easiest to specify the entire URL in the SINGLESTOREDB_URL environment variable.
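A minimal connection sketch; the user, password, host, port, and database name are placeholders:

```python
# Connect using a URL; individual keyword parameters (host=, user=, ...) also work.
conn = s2.connect('user:password@localhost:3306/dbname')
```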
The URL in the code above can take a couple of forms. The default driver is mysql, but you can also specify http for connecting to the SingleStoreDB data API.
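For example, the same call with a Data API URL (host and port are placeholders):

```python
# Connect over HTTP instead of the MySQL protocol.
http_conn = s2.connect('http://user:password@localhost:8090/dbname')
```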
Environment Variables
Connection URLs can also be set using an environment variable. This allows you to create environments that already have connection parameters embedded in them so that new connections can be made without specifying any parameters.
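A sketch with a placeholder URL:

```python
import os

# With SINGLESTOREDB_URL set, connect() needs no arguments.
os.environ['SINGLESTOREDB_URL'] = 'user:password@localhost:3306/dbname'
conn = s2.connect()
```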
Cursors
Most interaction with databases is done using cursors. To create a cursor from a connection, you use the cursor method.
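```python
cur = conn.cursor()
```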
Create tables and insert data
Using the cursor, we can execute commands for creating tables and inserting data.
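A sketch using a hypothetical table; the table name and columns are illustrative only:

```python
# Create a simple demo table to work with in the rest of the notebook.
cur.execute('CREATE TABLE IF NOT EXISTS mytable (id INT PRIMARY KEY, name TEXT)')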
Reset Tables
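A sketch that clears the hypothetical table so the examples below start from an empty state:

```python
# Remove any rows left over from a previous run of the notebook.
cur.execute('DELETE FROM mytable')
```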
Insert Data Using Named Parameters
Using named parameters such as :foo is the default parameter format in queries. This format requires the parameters to be passed in a dictionary.
Insert Data Using Positional Parameters
You can also use positional parameters, which reference the one-based index of a value in a list or tuple.
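A sketch following the one-based placeholder convention described above:

```python
# :1 and :2 refer to the first and second elements of the sequence.
cur.execute(
    'INSERT INTO mytable (id, name) VALUES (:1, :2)',
    [2, 'oranges'],
)
```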
Insert Data From DataFrame
In addition to this method, you can use the pandas.DataFrame.to_sql method with a SQLAlchemy SingleStore connection object.
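A sketch assuming the SQLAlchemy dialect is registered under the singlestoredb:// scheme; the connection URL is a placeholder:

```python
import pandas as pd
import sqlalchemy as sa

# The dialect name and URL are assumptions / placeholders.
engine = sa.create_engine('singlestoredb://user:password@localhost:3306/dbname')

df = pd.DataFrame([(3, 'pears'), (4, 'plums')], columns=['id', 'name'])
df.to_sql('mytable', con=engine, if_exists='append', index=False)
```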
Querying data
Here we are executing a simple SELECT operation.
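```python
cur.execute('SELECT * FROM mytable')
```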
There are various ways of fetching data, including the fetchone(), fetchmany(), and fetchall() methods described in the DB-API. In addition, you can simply iterate over the cursor itself. By default, results come back as tuples, but other result structures are available as well, including namedtuple and dict.
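A sketch of the fetch methods and of iterating over the cursor; the results_type connection option named in the comment is an assumption:

```python
row = cur.fetchone()     # a single row, or None when the results are exhausted
some = cur.fetchmany(2)  # up to 2 rows
rest = cur.fetchall()    # all remaining rows

# The cursor itself is iterable.
cur.execute('SELECT * FROM mytable')
for row in cur:
    print(row)

# The result structure can (assumption) be chosen at connect time, e.g.
# s2.connect(..., results_type='dicts') or results_type='namedtuples'.
```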
The following is a query using parameter substitution. While the underlying connection object is a mysql.connector-based object which takes the pyformat parameter format, we are using the sqlparams package to allow us to use whatever substitution form we want. Since the HTTP API uses question marks for parameters, we have adopted that format here as well.
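A sketch using the question-mark placeholder form this paragraph describes:

```python
cur.execute('SELECT * FROM mytable WHERE id > ?', [1])
print(cur.fetchall())
```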
The description field returns information about the query results. It includes fields such as name, type_code, display_size, internal_size, etc. These are defined in the DB-API, but are not all populated.
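```python
cur.execute('SELECT * FROM mytable')
for col in cur.description:
    print(col)
```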
Server Variables
Server variables can be accessed through a number of dictionary-like members on the connection. The members are globals, locals, cluster_globals, and cluster_locals when a scope is specified, or vars and cluster_vars for all variables regardless of scope.
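A sketch; the specific variable name is illustrative:

```python
# Dictionary-like access to server variables.
print(conn.globals['max_connections'])   # variable name is illustrative
print(list(conn.vars.keys())[:5])        # a few variable names, any scope
```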
Enabling the HTTP API
The HTTP API can be enabled using the conn.enable_http_api method. This method can optionally set the port number. If a port number is not specified, the existing setting will be used. The port number is returned.
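A sketch; the port value and the port= keyword are assumptions:

```python
# Returns the port the HTTP API is listening on.
port = conn.enable_http_api(port=8090)
print(port)
```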
Create an HTTP connection
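A sketch; host and credentials are placeholders, and the port comes from the step above:

```python
http_conn = s2.connect(f'http://user:password@localhost:{port}/dbname')
```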
Query data using the HTTP connection
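Querying through the HTTP connection looks the same as before:

```python
http_cur = http_conn.cursor()
http_cur.execute('SELECT * FROM mytable')
print(http_cur.fetchall())
```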
Workspace Management
The workspace management objects allow you to manage workspaces and create database connections to those workspaces. In order for this call to work, you either need to pass in a cluster management API token or have one set in your SINGLESTOREDB_MANAGEMENT_TOKEN environment variable.
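A sketch, assuming the manager is created with s2.manage_workspaces; the token value is a placeholder:

```python
# Omit the token argument if SINGLESTOREDB_MANAGEMENT_TOKEN is set.
mgr = s2.manage_workspaces('<management-api-token>')
```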
Get Current Clusters and Regions
Objects returned by WorkspaceManager.workspace_groups, WorkspaceManager.regions, and WorkspaceGroup.workspaces can be indexed by name and ID as well.
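```python
print(mgr.workspace_groups)
print(mgr.regions)

# As noted above, these collections can also be indexed by name or ID,
# e.g. mgr.workspace_groups['my-workspace-group'].
```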
Create a Workspace Group
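A hedged sketch; the keyword arguments and the region's id attribute are assumptions about the management API, and the name and password are placeholders:

```python
wg = mgr.create_workspace_group(
    'my-workspace-group',           # name is illustrative
    region=mgr.regions[0].id,       # region selection; attribute name is an assumption
    admin_password='<password>',    # placeholder
    firewall_ranges=['0.0.0.0/0'],  # open firewall for demo purposes only
)
```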
Create a Workspace in the Group
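A hedged sketch; the size and wait_on_active parameters are assumptions:

```python
ws = wg.create_workspace('my-workspace', size='S-00', wait_on_active=True)
```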
Connect to the Workspace
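A hedged sketch; the connect() parameters and credentials are assumptions / placeholders:

```python
ws_conn = ws.connect(user='admin', password='<password>')
```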
Working with Stages
A Stage is a place where you can store data and SQL files for use in your database. Stage files can be written to interactively, or be used as files by other Python objects. Stages are created on a per-workspace-group basis, so we'll use the workspace group created above.
Currently there are no files in our Stage.
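A sketch of listing the Stage contents, assuming the Stage is reached through the workspace group's stage attribute and has a listdir method:

```python
st = wg.stage           # attribute name is an assumption
print(st.listdir('/'))  # expected to be empty at this point
```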
Uploading a file
The most basic usage of stages is to upload files from disk. We will upload a local file named test.sql to the stage_test.sql file.
Information about the file can be accessed using the resulting object's attributes and methods.
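A hedged sketch; upload_file is an assumed method name:

```python
f = st.upload_file('test.sql', 'stage_test.sql')
print(f)  # the returned object carries metadata about the Stage file
```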
Downloading content
The file contents can also be downloaded. Note that the returned content will be in bytes unless you specify an encoding= parameter.
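A sketch; download_file is an assumed method name:

```python
raw = st.download_file('stage_test.sql')                     # bytes by default
text = st.download_file('stage_test.sql', encoding='utf-8')  # decoded to str
```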
Using the file interface
It is also possible to treat a Stage file as a local file using Python's file interface. Note that this creates a temporary in-memory file until the final read / write is completed, then the data is synced.
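A sketch of reading the uploaded file through the file interface; the 'r' mode is an assumption:

```python
with st.open('stage_test.sql', 'r') as fp:
    print(fp.read())
```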
Since the open method returns an object that supports Python's file interface, we can use it from other APIs as well, such as pandas' read_csv function.
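For example, with a hypothetical CSV file stored in the Stage:

```python
import pandas as pd

# 'data.csv' is a hypothetical Stage file used only for illustration.
df = pd.read_csv(st.open('data.csv', 'r'))
```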