Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
singlestore-labs
GitHub Repository: singlestore-labs/singlestoredb-python
Path: blob/main/examples/getting-started.ipynb
469 views
Kernel: Python 3 (ipykernel)

Getting Started

Install the SingleStore package

The SingleStore package can be installed the following way:

pip install singlestoredb

In addition, you can install the SQLAlchemy and Ibis plugins with the following:

pip install singlestore[dataframe]

Import SingleStore

import singlestoredb as s2

Create a connection

This function is compliant with the Python DB-API 2.0. In addition, it allows you to use a URL as a connection string rather than individual parameters. Parameters may also be set using environment variable (e.g., SINGLESTOREDB_HOST, SINGLESTOREDB_USER, SINGLESTOREDB_PASSWORD, etc.), but it is easiest to specify the entire URL in the SINGLESTOREDB_URL environment variable.

conn_url = 'root:@127.0.0.1:9306/x_db'
conn = s2.connect(conn_url)

The URL in the code above can take a couple of forms. The default driver is mysql, but you can also specify http for connecting to the SingleStoreDB data API.

http://root:@localhost:9000/x_db

Environment Variables

Connection URLs can also be set using an environment variable. This allows you to create environments that already have connection parameters embedded in them so that new connections can be made without specifying any parameters.

import os os.environ['SINGLESTOREDB_URL'] = conn_url

Cursors

Most interaction with databases are done using cursors. To create a cursor from a connection, you use the cursor method.

cur = conn.cursor()

Create tables and insert data

Using the cursor, we can execute commands for creating tables and inserting data.

Reset Tables

cur.execute(r'drop table if exists departments') cur.execute(r'drop table if exists employees') cur.execute(r'drop table if exists salaries') cur.execute(r''' create table if not exists departments ( id int, name varchar(255), primary key (id) );''') cur.execute(r''' create table if not exists employees ( id int, deptId int, managerId int, name varchar(255), hireDate date, state char(2), primary key (id) );''') cur.execute(r''' create table if not exists salaries ( employeeId int, salary int, primary key (employeeId) );''')
0

Insert Data Using Named Parameters

Using named parameters such as :foo is the default parameter format in queries. This requires dictionaries for the parameter structure.

cur.executemany(r'insert into departments(id, name) values (%(id)s, %(name)s)', [ dict(id=1, name='Marketing'), dict(id=2, name='Finance'), dict(id=3, name='Sales'), dict(id=4, name='Customer Service'), ] )
4

Insert Data Using Positional Parameters

You can also use positional parameters which specify the one-based index in a list or tuple.

cur.executemany(r'insert into employees (id, deptId, managerId, name, hireDate, state) ' r'values (%s, %s, %s, %s, %s, %s)', [ (1, 2, None, "Karly Steele", "2011-08-25", "NY"), (2, 1, 1, "Rhona Nichols", "2008-09-11", "TX"), (3, 4, 2, "Hedda Kent", "2005-10-27", "TX"), (4, 2, 1, "Orli Strong", "2001-07-01", "NY"), (5, 1, 1, "Leonard Haynes", "2011-05-30", "MS"), (6, 1, 5, "Colette Payne", "2002-10-22", "MS"), (7, 3, 4, "Cooper Hatfield", "2010-08-19", "NY"), (8, 2, 4, "Timothy Battle", "2001-01-21", "NY"), (9, 3, 1, "Doris Munoz", "2008-10-22", "NY"), (10, 4, 2, "Alea Wiggins", "2007-08-21", "TX"), ] )
10

Insert Data From DataFrame

In addition to this method, you can use the pandas.DataFrame.to_sql method with a SQLAlchemy SingleStore connection object.

import pandas as pd df = pd.DataFrame( [ (1, 885219), (2, 451519), (3, 288905), (4, 904312), (5, 919124), (6, 101538), (7, 355077), (8, 900436), (9, 41557), (10, 556263), ], columns=['employeeId', 'salary']) cur.executemany(r'insert into salaries (employeeId, salary) ' r'values (%s, %s)', df)
10

Querying data

Here we are executing a simple SELECT operation.

cur.execute('select name from employees')
10

There are various ways of fetching data including the fetchone(), fetchmany(), and fetchall() methods description in the DB-API. In addition, you can simply iterate over the cursor itself.

for item in cur: print(item)
('Cooper Hatfield',) ('Rhona Nichols',) ('Orli Strong',) ('Karly Steele',) ('Hedda Kent',) ('Doris Munoz',) ('Leonard Haynes',) ('Colette Payne',) ('Timothy Battle',) ('Alea Wiggins',)

By default, results come back in tuples, but there are other result structure options as well, they include namedtuple and dict.

conn = s2.connect(conn_url, results_type='dicts')
cur = conn.cursor() cur.execute('select name from employees') cur.fetchall()
[{'name': 'Cooper Hatfield'}, {'name': 'Rhona Nichols'}, {'name': 'Orli Strong'}, {'name': 'Karly Steele'}, {'name': 'Hedda Kent'}, {'name': 'Doris Munoz'}, {'name': 'Leonard Haynes'}, {'name': 'Colette Payne'}, {'name': 'Timothy Battle'}, {'name': 'Alea Wiggins'}]

The following is a query using parameter substitution. While the underlying connection object is a mysql.connector-based object which takes the pyformat parameter format, we are using the sqlparams package to allow us to use whatever substitution form we want. Since the HTTP API uses question marks for parameters, we have adopted that format here as well.

cur.execute('select name, hireDate from employees where name like %s', ['%Rhona%'])
1

The description field returns information about the query results. It includes fields such as name, type_code, display_size, internal_size, etc. These are defined in the DB-API, but are not all populated.

cur.description
(Description(name='name', type_code=253, display_size=None, internal_size=255, precision=255, scale=0, null_ok=True, flags=0, charset=33), Description(name='hireDate', type_code=10, display_size=None, internal_size=30, precision=30, scale=31, null_ok=True, flags=0, charset=33))
cur.fetchall()
[{'name': 'Rhona Nichols', 'hireDate': datetime.date(2008, 9, 11)}]

Server Variables

Server variables can be accessed through a number of dictionary-like members on the connection. The members are globals, locals, cluster_globals, cluster_locals when a scope is specified, or vars and cluster_vars for all variables regardless of scope.

dict(conn.globals)
{'active_query_users_blacklist': '', 'activities_delta_sleep_s': '1', 'advanced_hdfs_pipelines': False, 'aes_default_encryption_mode': 'aes-128-ecb', 'aggregator_failure_detection': False, 'allow_async_bottomless': False, 'allow_ipv6': False, 'allow_newer_to_older_replication': False, 'allow_proc_cpuinfo_errors': False, 'assert_on_cluster_db_reprovisioning': False, 'async_backup': True, 'auditlog_disk_sync': False, 'auditlog_level': False, 'auditlog_retention_period': '0', 'auditlog_rotation_size': '134217728', 'auditlog_rotation_time': '3600', 'auto_replicate': False, 'autocommit': True, 'autostats_flush_interval_secs': '600', 'backup_max_threads': '0', 'backup_multipart_upload_concurrency': '3', 'basedir': '/src/memsql/release', 'batch_external_functions': 'ALWAYS', 'batch_external_functions_size': '512', 'blob_cache_eviction_policy': 'LRU2', 'blob_cache_threadpool_max_concurrent': '32', 'bottomless_blob_compression_level': '0', 'bottomless_compression_level': '1', 'bottomless_gc_retention_period_minutes': '1440', 'bottomless_gc_retention_period_test_mode': True, 'bottomless_idle_upload_ms': '60000', 'bottomless_incremental_download_max_buffer_size': '1048576', 'bottomless_inject_retriable_download_errors': False, 'bottomless_inject_retriable_upload_errors': False, 'bottomless_skip_create_database_api_checks': False, 'bottomless_snapshot_trigger_log_chunks': '300', 'bottomless_upload_max_concurrent': '8', 'bottomless_upload_throttle_hard_limit_secs': '1800', 'bottomless_upload_throttle_min_disk_limit_mb': '0', 'bottomless_upload_throttle_soft_limit_secs': '600', 'cardinality_estimation_level': '7.3', 'character_set_client': 'utf8', 'character_set_connection': 'utf8', 'character_set_filesystem': 'binary', 'character_set_results': 'utf8', 'character_set_server': 'utf8', 'character_sets_dir': '/src/memsql/release/share/charsets/', 'cluster_name': 'memsql_cluster', 'collation_connection': 'utf8_general_ci', 'collation_database': 'utf8_general_ci', 'collation_server': 'utf8_general_ci', 'collect_average_size': True, 'columnstore_flush_bytes': '33554432', 'columnstore_ingest_management_queue_timeout': '3600', 'columnstore_multicol_index': True, 'columnstore_segment_rows': '1024000', 'columnstore_unique_key_load_merge_factor': '3', 'columnstore_validate_blob_before_merge': False, 'columnstore_window_size': '2147483648', 'compat_version': '', 'compile_only': False, 'compiled_images_eviction_memory_limit_mb': '0', 'compiled_images_eviction_memory_limit_percent': '0.000000', 'connect_timeout': '10', 'consensus_enabled': False, 'core_file': True, 'core_file_mode': 'PARTIAL', 'core_on_bad_errnos_in_file_io_apis': True, 'core_on_stack_overflow': False, 'crash_on_error_code': '0', 'critical_diagnostics': True, 'critical_diagnostics_max_payload_size': '65536', 'custom_java_pipelines': False, 'data_conversion_compatibility_level': '8.0', 'datadir': '/src/memsql/release/data', 'default_autostats_enabled': False, 'default_distributed_ddl_timeout': '180000', 'default_partitions_per_leaf': '3', 'default_storage_engine': '', 'default_table_type': 'columnstore', 'default_user_require_ssl': False, 'derived_table_as_cte_seed': '0', 'disable_eventual_consistency_asserts': False, 'disk_plan_expiration_minutes': '20160', 'dist_txn_outcomes_keepalive_secs': '1800', 'distributed_commit_lock_timeout': '180000', 'enable_alias_space_trim': False, 'enable_background_plan_invalidation': False, 'enable_clone_restore': True, 'enable_columnstore_ingest_management': True, 'enable_compilation_deduplication': True, 'enable_compiled_images_eviction': True, 'enable_disk_plan_expiration': True, 'enable_disk_plan_explain': True, 'enable_dml_query_forwarding': True, 'enable_engine_idle_time_tracking': False, 'enable_experimental_metrics': False, 'enable_external_functions': True, 'enable_fips': False, 'enable_query_forwarding': True, 'enable_recursive_ctes': True, 'enable_scan_partition_parallelism': True, 'enable_spilling': True, 'enable_subprocess_tracing': False, 'enable_varbuffer_dictionary_compression': False, 'enable_wasm': True, 'expected_leaf_core_count': '8', 'explain_expression_limit': '500', 'explicit_defaults_for_timestamp': True, 'exporter_path': '', 'exporter_port': '0', 'exporter_ssl_ca': '', 'exporter_ssl_capath': '', 'exporter_ssl_cert': '', 'exporter_ssl_key': '', 'exporter_use_https': False, 'exporter_user': 'root', 'external_functions_allowlist': '', 'external_functions_batch_size': '512', 'failover_on_low_disk': True, 'flush_before_replicate': False, 'force_bushy_join_table_limit': '18', 'forwarded_query': False, 'general_log': False, 'geo_sphere_radius': '6367444.657120', 'global_versioning': True, 'group_concat_max_len': '16777216', 'gssapi_keytab_path': '/src/memsql/memsql.keytab', 'gssapi_principal_name': 'memsql/[email protected]', 'highlight_fragment_size': '100', 'highlight_max_number_fragments': '0', 'hostname': 'mate', 'http_api': True, 'http_api_max_idle_seconds': '86400', 'http_api_max_lifetime_seconds': '0', 'http_api_pool_capacity': '2048', 'http_api_pool_max_idle': '1024', 'http_api_session_vars': '', 'http_proxy_port': '8100', 'https_proxy_port': '0', 'identity': '0', 'ignore_foreign_keys': False, 'ingest_errors_max_disk_space_mb': '102400', 'interpreter_mode': 'INTERPRET_FIRST', 'java_extractor_max_memory': '500', 'java_pipelines_class_path': '', 'java_pipelines_extractor_class': '', 'java_pipelines_heap_size': '8', 'java_pipelines_java_home': '', 'java_pipelines_java_path': '', 'json_compatibility_level': '7.8', 'json_extract_string_collation': 'server_v2', 'jwks_endpoint': '', 'jwks_ssl_ca_certificate': '', 'jwks_update_interval': '3600', 'jwt_auth_config_file': '', 'large_txs_and_blobs': False, 'lc_messages': 'en_US', 'lc_messages_dir': '/src/memsql/release/share', 'leaf_failover_fanout': 'paired', 'leaf_failure_detection': False, 'load_data_cpu_arch': '0', 'load_data_errors_retention_minutes': '1440', 'load_data_internal_compression': True, 'load_data_max_buffer_size': '1073741823', 'load_data_read_size': '8192', 'load_data_write_size': '8192', 'lock_wait_timeout': '60', 'lockfree_backup': True, 'log_file_size_partitions': '16777216', 'log_file_size_ref_dbs': '16777216', 'master_aggregator': 'self', 'master_promote_kill_timeout_seconds': '10', 'max_allowed_packet': '104857600', 'max_async_compilation_concurrency': '0', 'max_compilation_memory_mb': '4096', 'max_compilation_time_s': '600', 'max_connection_threads': '256', 'max_connections': '1000', 'max_dedicated_admin_connections': '5', 'max_optimizer_join_threads': '8', 'max_pooled_connections': '128', 'max_prefetch_threads': '1', 'max_prepared_stmt_count': '16382', 'max_user_connections': '0', 'maximum_blob_cache_size_mb': '669906', 'maximum_blob_cache_size_percent': '0.000000', 'maximum_memory': '57849', 'maximum_table_memory': '52064', 'memsql_build_flavor': 'release', 'memsql_id': '47CB3085A5C6D037C8E2650978345803A4588044', 'memsql_original_server_version': '8.1.0', 'memsql_snapshot_trigger_size': '134217728', 'memsql_version': '8.1.0', 'memsql_version_date': 'Tue May 2 14:55:00 2023 -0400', 'memsql_version_hash': '7578b5100abb1d809124ecc30d89b742601840ea', 'minimal_disk_space': '100', 'monitoring_retention_period': '10080', 'multi_insert_tuple_count': '20000', 'multi_statement_xact_idle_timeout': '300', 'net_buffer_length': '102400', 'net_read_timeout': '3600', 'net_write_timeout': '3600', 'no_bottle_service_link': False, 'no_bottomless_fsyncs': False, 'node_degree_of_parallelism': '0', 'node_disk_low_threshold_mb': '10240', 'node_memory_low_threshold_mb': '512', 'node_ping_latency_high_threshold_ms': '1000.000000', 'node_replication_ssl_only': False, 'null_timestamps_for_testing': False, 'num_background_merger_threads': '2', 'page_ack_delays': False, 'parametrizer_query_max_params': '1048576', 'password_max_consec_repeat_chars': '0', 'password_max_consec_sequential_chars': '0', 'password_min_length': '0', 'password_min_lowercase_chars': '0', 'password_min_numeric_chars': '0', 'password_min_special_chars': '0', 'password_min_uppercase_chars': '0', 'pause_recovery': False, 'performance_schema': False, 'pid_file': '/src/memsql/release/data/mate.pid', 'pipelines_batches_metadata_to_keep': '1000', 'pipelines_deskew_batch_partitions_threshold': '0.750000', 'pipelines_errors_retention_minutes': '1440', 'pipelines_extractor_core_on_timeout': False, 'pipelines_extractor_debug_logging': False, 'pipelines_extractor_max_memory': '500', 'pipelines_gc_max_files': '10000', 'pipelines_kafka_version': '0.8.2.2', 'pipelines_max_concurrent': '50', 'pipelines_max_concurrent_batch_partitions': '0', 'pipelines_max_errors_per_partition': '1000', 'pipelines_max_offsets_per_batch_partition': '1000000', 'pipelines_max_pooled_extractors': '256', 'pipelines_max_retries_per_batch_partition': '4', 'pipelines_offsets_gc_skew_time_minutes': '1440', 'pipelines_pooled_extractor_batches': '256', 'pipelines_stderr_bufsize': '65535', 'pipelines_stop_on_error': True, 'pipelines_stored_proc_exactly_once': True, 'plan_expiration_minutes': '720', 'port': '9306', 'print_tree': '0', 'privilege_transfer_mode': 'grant_option', 'processlist_rpc_json_max_size': '2048', 'promote_aggregator_timeout_ms': '180000', 'protocol_version': '10', 'query_parallelism': '0', 'query_parallelism_per_leaf_core': '1.000000', 'query_shape_serialization_enabled': 'AUTO', 'query_shape_serialization_file_size_cap': '1048576', 'query_shape_serialization_show_parameters': True, 'read_advanced_counters': False, 'redundancy_level': '1', 'regexp_compile_mem_mb': '2', 'regexp_format': 'extended', 'regexp_output_validation_mode': 'throw_error', 'replstress_stackhasher': False, 'replstress_stackhasher_kill': False, 'resource_governor_cpu_limit_mode': 'SOFT', 'resource_pool': 'default_pool', 'resource_pool_statement_selector_function': '', 'resource_usage_model': False, 'result_table_error_lifetime': '1000', 'rowstore_gc_thread_count': '2', 'saml_assertion_audience': 'sp.memsql.com', 'saml_message_recipient': '', 'saml_private_decryption_key': '../memsqltest/script_tests/security/saml/files/sp.pem', 'saml_require_encryption': False, 'saml_require_signature_validation': False, 'saml_use_NameID': False, 'saml_user_name_attribute': 'roles', 'saml_x509_certificate': '../memsqltest/script_tests/security/saml/files/idp.crt', 'secure_file_priv': '', 'service_edition': 'STANDARD', 'show_query_parameters': True, 'show_with_portability_comments': False, 'singlestoredb_build_flavor': 'release', 'singlestoredb_version': '8.1.0', 'singlestoredb_version_date': 'Tue May 2 14:55:00 2023 -0400', 'singlestoredb_version_hash': '7578b5100abb1d809124ecc30d89b742601840ea', 'skip_incremental_backup_safety_checks': False, 'skip_name_resolve': 'AUTO', 'skip_segelim_with_inlist_threshold': '1000', 'skynet_fscache_sim_inject_enomem': True, 'skynet_out_of_disk_simulation': '0.000000', 'snapshot_trigger_size': '2147483648', 'snapshot_wait_for_blob_gc_seconds': '180', 'snapshots_to_keep': '2', 'socket': '/src/memsql/release/data/memsql.sock', 'some_sync_var': '0', 'sp_query_dynamic_param': 'AUTO', 'spilling_maximum_disk_percent': '-1.000000', 'spilling_minimal_disk_space': '500', 'spilling_node_memory_threshold_ratio': '0.750000', 'spilling_query_operator_memory_threshold': '104857600', 'sql_mode': 'STRICT_ALL_TABLES', 'sql_quote_show_create': True, 'sql_select_limit': '18446744073709551615', 'ssl_ca': '', 'ssl_capath': '', 'ssl_cert': '', 'ssl_cipher': '', 'ssl_fips_mode': False, 'ssl_key': '', 'ssl_last_reload_attempt_time': '', 'ssl_last_successful_reload_time': '', 'sub_to_physical_partition_ratio': '4', 'subproc_abridged_errors': True, 'subprocess_azure_retries': '10', 'subprocess_backup_retries': '3', 'subprocess_ec2_metadata_timeout_ms': '60000', 'subprocess_io_idle_timeout_ms': '600000', 'subprocess_max_retries': '10', 'sync_durable_tables': False, 'sync_permissions': True, 'sync_slave_timeout': '10000', 'system_time_zone': 'CDT', 'table_name_case_sensitivity': True, 'test_physical_threads': '0', 'test_with_auto_replicate': False, 'thread_cache_size': '0', 'thread_handling': 'one-thread-per-connection', 'thread_stack': '1048576', 'thread_stackoverflow_guard_size_factor': '2', 'time_zone': 'SYSTEM', 'tls_version': 'TLSv1,TLSv1.1,TLSv1.2', 'tmpdir': '.', 'transaction_buffer': '134217728', 'transaction_isolation': 'READ-COMMITTED', 'trim_malloc': True, 'two_phase_commit': False, 'tx_isolation': 'READ-COMMITTED', 'upgraded_variables': True, 'use_avx2': True, 'use_join_Bucket_bit_vector': True, 'use_merge_on_encoded_data': 'AUTO', 'use_vectorized_join': True, 'varchar_column_string_optimization_length': '0', 'version': '5.7.32', 'version_comment': 'SingleStoreDB source distribution (compatible; MySQL Enterprise & MySQL Commercial)', 'version_compile_machine': 'x86_64', 'version_compile_os': 'Linux', 'warn_level': 'WARNINGS', 'wasm_alloc_max_flex_size': '2097152', 'wasm_alloc_retry_count': '3', 'wasm_alloc_retry_interval': '1000', 'wasm_create_from_url': True, 'wasm_max_compiled_module_size': '26214400', 'wasm_max_image_cache_size': '26214400', 'wasm_max_linear_memory_size': '4294967295', 'wasm_max_raw_module_size': '26214400', 'workload_management': False, 'workload_management_dynamic_resource_allocation': False, 'workload_management_enable_static_partitioning': False, 'workload_management_expected_aggregators': '0', 'workload_management_max_connections_per_leaf': '10000', 'workload_management_max_queue_depth': '100', 'workload_management_max_threads_per_leaf': '8192', 'workload_management_memory_queue_threshold': '0.010000', 'workload_management_memory_queuing': True, 'workload_management_queue_size_allow_upgrade': '1', 'workload_management_queue_time_warning_ratio': '0.500000', 'workload_management_queue_timeout': '3600'}
conn.globals.enable_external_functions = True
conn.globals.enable_external_functions
True
cur.execute('show variables like "enable_external_functions"') cur.fetchall()
[{'Variable_name': 'enable_external_functions', 'Value': 'ON'}]

Enabling the HTTP API

The HTTP API can be enabled using the conn.enable_http_api method. This method can optionally set the port number. If a port number is not specified, the existing setting will be used. The port number is returned.

conn.enable_http_api(port=8100)
8100

Create an HTTP connection

s2.options.results.type = 'namedtuples'
http_conn = s2.connect('http://root:@localhost:8100/x_db')
http_cur = http_conn.cursor()

Query data using the HTTP connection

http_cur.execute('select name from employees')
10
http_cur.description
[Description(name='name', type_code=253, display_size=None, internal_size=None, precision=None, scale=None, null_ok=True, flags=0, charset=0)]
http_cur.fetchall()
[Row(name='Leonard Haynes'), Row(name='Colette Payne'), Row(name='Timothy Battle'), Row(name='Alea Wiggins'), Row(name='Cooper Hatfield'), Row(name='Rhona Nichols'), Row(name='Orli Strong'), Row(name='Karly Steele'), Row(name='Hedda Kent'), Row(name='Doris Munoz')]
http_cur.execute('select name, hireDate from employees where name like %s', ['%Rhona%'])
1
http_cur.description
[Description(name='name', type_code=253, display_size=None, internal_size=None, precision=None, scale=None, null_ok=True, flags=0, charset=0), Description(name='hireDate', type_code=10, display_size=None, internal_size=None, precision=None, scale=None, null_ok=True, flags=0, charset=0)]
df = http_cur.fetchall() df
[Row(name='Rhona Nichols', hireDate=datetime.date(2008, 9, 11))]

Workspace Management

The objects allow you to manage workspaces and create database connections to those workspaces. In order for this call to work, you either need to pass in a cluster management API token or have one set in your SINGLESTOREDB_MANAGEMENT_TOKEN environment variable.

wm = s2.manage_workspaces()

Get Current Clusters and Regions

wm.workspace_groups
[WorkspaceGroup(name='Python Client Testing', id='17b0fa56-bb7f-48d4-b351-f6296505711d', created_at=datetime.datetime(2023, 9, 26, 0, 0), firewall_ranges=['0.0.0.0/0'], region=Region(name='US East 1 (N. Virginia) - HD2', id='452cc4b1-df20-4130-9e2f-e72ba79e3d46', provider='AWS')), WorkspaceGroup(name='Group 2', id='24fe2eca-af66-45c3-918b-035780eb177a', created_at=datetime.datetime(2023, 10, 16, 0, 0), firewall_ranges=['166.113.65.66/32'], region=Region(name='US West 2 (Oregon)', id='64031b39-3da1-4a7b-8d3d-6ca86e8d71a7', provider='AWS'))]
wm.regions
[Region(name='Europe West 2 (London)', id='04eb4250-5417-4300-9822-70cf4f114543', provider='AWS'), Region(name='Asia Southeast 1 (Singapore)', id='0bbbde94-7c33-4592-95bc-470332396a82', provider='GCP'), Region(name='US West 1 (Oregon)', id='1c1de314-2cc0-4c74-bd54-5047ff90842e', provider='GCP'), Region(name='Europe West 2 (London)', id='1cbf85d7-0ae2-49ab-ad09-35fd19c02f04', provider='GCP'), Region(name='US Central 1 (Iowa)', id='22a913e1-ace5-4362-afa0-3d21a01d6d1c', provider='GCP'), Region(name='Asia South 1 (Mumbai)', id='3186a27b-87d5-4b84-a93b-90c57fd2be74', provider='GCP'), Region(name='US East 1 (N. Virginia)', id='3482219c-a389-4079-b18b-d50662524e8a', provider='AWS'), Region(name='Asia Pacific Southeast 1 (Singapore)', id='3d226d4b-90b4-4a8d-848c-6c8403ee905c', provider='AWS'), Region(name='US East 1 (N. Virginia) - HD2', id='452cc4b1-df20-4130-9e2f-e72ba79e3d46', provider='AWS'), Region(name='Europe Central 1 (Frankfurt)', id='5781fbe5-8e2d-40ee-8cfe-82d6cfe74aed', provider='AWS'), Region(name='Europe North 1 (Stockholm)', id='57ba8143-d4cc-470a-901f-871c684ee2bc', provider='AWS'), Region(name='US West 2 (Oregon)', id='64031b39-3da1-4a7b-8d3d-6ca86e8d71a7', provider='AWS'), Region(name='US East 1 (S. Carolina)', id='6de63e09-6a66-4846-9c43-cd866e906d36', provider='GCP'), Region(name='Asia Southeast 2 (Jakarta)', id='6e1b6fae-fe87-4aaa-bc6e-55d9e4c23806', provider='GCP'), Region(name='Europe West 1 (Belgium)', id='6e218f48-a71f-4521-8c7c-53d0ef087a42', provider='GCP'), Region(name='South America East 1 (Sao Paulo)', id='7b9163df-67b0-45fb-985b-e571fbcb3792', provider='AWS'), Region(name='Europe West 4 (Netherlands)', id='7edebe18-2fdf-43c3-95b9-e4e83f17c756', provider='GCP'), Region(name='Asia Pacific South 1 (Mumbai)', id='86e33c9f-a635-4320-807e-500f97cc8e1f', provider='AWS'), Region(name='Africa South 1 (Cape Town)', id='88b166e9-580a-4b26-801e-170a6bfb3e5c', provider='AWS'), Region(name='South Central US (Texas)', id='90b70be1-6ca6-48c2-9651-f40aa1fa3cee', provider='Azure'), Region(name='US East 4 (N. Virginia)', id='93b61160-0cae-4e11-a5de-977b8e2e3ee5', provider='GCP'), Region(name='Asia Pacific Southeast 3 (Jakarta)', id='976e6b6b-deb2-4041-987e-54c4641e3f7f', provider='AWS'), Region(name='Europe North 1 (Finland)', id='9939d0fb-e4f0-448c-b849-23121138e88a', provider='GCP'), Region(name='Canada Central 1 (Montreal)', id='99b1a977-cde0-496f-8c2e-0946b2f444db', provider='AWS'), Region(name='Europe West 3 (Frankfurt)', id='a6cf93ae-eaa3-4c83-a6b8-de7608d6b221', provider='GCP'), Region(name='Europe West 1 (Ireland)', id='b7bc7d58-3073-4ed6-883a-4676ddac4e46', provider='AWS'), Region(name='US East 2 (Ohio)', id='b7cfe010-204a-4b70-8611-887bf35dc184', provider='AWS'), Region(name='Europe West 3 (Paris)', id='c04471f4-5d89-46e1-8d1e-76b8282ea0df', provider='AWS'), Region(name='Asia Pacific Southeast 2 (Sydney)', id='c74bb6a6-0f14-4d56-bc80-1d874fa277b7', provider='AWS'), Region(name='Central India (Pune)', id='ccb99d91-e9da-43f9-a0ce-b7b09d2b30fb', provider='Azure'), Region(name='East US 1 (Virginia)', id='e8f6f596-6fba-4b87-adb1-7f9e960c7c78', provider='Azure'), Region(name='West Europe (Netherlands)', id='ef8d7728-c25c-465c-bd4f-46aa728c01fc', provider='Azure')]

Objects returned by WorkspaceManager.workspace_groups, WorkspaceManager.regions, and WorkspaceGroup.workspaces can be indexed by name and ID as well.

reg = wm.regions['US West 1 (Oregon)'] reg
Region(name='US West 1 (Oregon)', id='1c1de314-2cc0-4c74-bd54-5047ff90842e', provider='GCP')
reg = wm.regions['3d226d4b-90b4-4a8d-848c-6c8403ee905c'] reg
Region(name='Asia Pacific Southeast 1 (Singapore)', id='3d226d4b-90b4-4a8d-848c-6c8403ee905c', provider='AWS')

Create a Workspace Group

import secrets password = secrets.token_urlsafe(20)
wg = wm.create_workspace_group( 'Demo Workspace Group', region=[x for x in wm.regions if x.name.startswith('US')][0], admin_password=password, firewall_ranges=['0.0.0.0/0'], ) wg
WorkspaceGroup(name='Demo Workspace Group', id='b877f081-25cc-4c4c-8552-b96a919bd0aa', created_at=datetime.datetime(2023, 10, 17, 0, 0), region=Region(name='US West 1 (Oregon)', id='1c1de314-2cc0-4c74-bd54-5047ff90842e', provider='GCP'))

Create a Workspace in the Group

ws = wg.create_workspace('workspace-1', wait_on_active=True)
ws
Workspace(name='workspace-1', id='670492f6-c458-4b5d-88e6-1bf6a8d07add', created_at=datetime.datetime(2023, 10, 17, 0, 0), endpoint='svc-b877f081-25cc-4c4c-88e6-1bf6a8d07add-dml.gcp-oregon-1.svc.singlestore.com', group_id='b877f081-25cc-4c4c-8552-b96a919bd0aa', size='S-00', state='ACTIVE')

Connect to the Workspace

with ws.connect(user='admin', password=password) as conn: with conn.cursor() as cur: cur.execute('show databases') print(cur.fetchall())
[Row(Database='cluster'), Row(Database='information_schema'), Row(Database='memsql')]
ws.terminate(wait_on_terminated=True)

Working with Stages

Stages is a place when you can store data and SQL files for use in your database. Stage files can be written to interactively, or be used as files by other Python objects. Stages are created on a per-workspacegroup basis, so we'll use the workspace group created above.

Currently there are no files in our Stage.

wg.stages.listdir()
[]

Uploading a file

The most basic usage of stages is to upload files from disk. We will upload a local file named test.sql to the stage_test.sql file.

f = wg.stages.upload_file('test.sql', 'stage_test.sql') f
StagesObject(name='stage_test.sql', created_at=datetime.datetime(2023, 10, 17, 0, 0), format='text', last_modified_at=datetime.datetime(2023, 10, 17, 0, 0), mimetype='text/plain', path=PurePosixPath('stage_test.sql'), size=88, type='file', writable=True)
wg.stages.listdir()
['stage_test.sql']

Information about the file can be accessed using the resulting object's attributes and methods.

print('name', f.name) print('created at', f.created_at) print('is writable?', f.writable) print('is dir?', f.is_dir())
name stage_test.sql created at 2023-10-17 00:00:00 is writable? True is dir? False

Downloading content

The file contents can also be downloaded. Note that the returned content will by in bytes unless you specify an encoding= parameter.

print(f.download(encoding='utf-8'))
CREATE TABLE people ( id BIGINT, first_name CHAR(30), last_name CHAR(50) );

Using the file interface

It is also possible to treat a Stage file as a local file using Python's file interface. Note that this creates a temporary in-memory file until the final read / write is completed, then the data is synced.

with wg.stages.open('stage_open_test.csv', 'w') as wfile: wfile.write('A,B,C\n') wfile.write('1,2,3\n') wfile.write('4,5,6\n')
print(wg.stages.download('stage_open_test.csv', encoding='utf-8'))
A,B,C 1,2,3 4,5,6

Since the open method returns an object that supports Python's file interface, we can use it from other APIs as well such as pandas' read_csv function.

import pandas as pd with wg.stages.open('stage_open_test.csv', 'r') as rfile: df = pd.read_csv(rfile) df
with wg.stages.open('stage_pandas_test.csv', 'w') as wfile: df.to_csv(wfile, index=False) print(wg.stages.download('stage_pandas_test.csv', encoding='utf-8'))
A,B,C 1,2,3 4,5,6

Clean up

wg.terminate()