What is AMSDAL Glue?¶
AMSDAL Glue is a flexible, lightweight, powerful and easy to use opensource ETL interface and configurable implementation, built on top of the Python programming language, designed to separate analytics, applications and ORMs from data, operate across multiple disperate databases or other data stores, all through a common interface, using SQL, Python objects, or custom parser, query and command planners.
Introduction¶
AMSDAL Glue is a simple yet critical data technology that has value in almost any data applications.
In it's most basic sense, AMSDAL Glue provides users a common interface to plan and execute complex data commands and queries, as well as services to manage them, without limiting the feature-set and funcionality of the underlying data technology itself.
This unified, flexible, all-important ETL technology serves as a powerful tool to simplify the development and maintenance process of database-related tasks in Python applications, reducing the cost of ownership and increasing the scalability.
Common ETL Interface¶
We wanted a unified way to interact with all of our data (SQL, NoSQL, unstructured, integrations, etc), allowing us to build application and analytics that utilize a standardized ETL library and are clearly delineated from their data.
Multiple Simultaneous Connections¶
We wanted to be able to connect with or integrate to multiple data sources simultaneously (e.g. join data from seperate data sources), allowing our application to be agnostic to where it's data is physically located (now or in the future).
Flexibility & Performance¶
We wanted our ETL library to be highly flexible and as low a level of abstraction as possible so that it didn't dillute the power of the specifics of each database type; at it's core amsdal-glue-core simply provides us a common interface to interact with our data, allowing us to delegate as much of the workload to the underlying database technology as possible.
In essence, by detaching your application from its data, AMSDAL Glue simplifies the process of writing and maintaining database-related code with a minimal sacrifice of flexibility or perfmance.
For more information, please refer to the api.
Installation¶
The AMSDAL Glue project consists of three main packages: amsdal-glue-core
, amsdal-glue-connections
,
and amsdal-glue
. You can install each package separately or install the amsdal-glue
package, which includes all the
dependencies.
You can install the AMSDAL Glue project using pip
:
pip install amsdal-glue
This command automatically installs amsdal-glue-core
and amsdal-glue-connections
packages as dependencies.
Note, the amsdal-glue-connections
will be installed without extra dependencies.
If you want to use PostgresConnection
in order to connect to postgres database, you need explicitly install
the postgres
or postgres-binary
extra dependency:
# It will install the `psycopg` package, usually it will build from source, that is slower than the `postgres-binary`
# package, but it is recommended for production use.
pip install amsdal-glue[postgres]
# Or you can install the `psycopg-binary` dependency, it is faster than the `psycopg` package installation.
pip install amsdal-glue[postgres-binary]
Usage¶
Here is a simple example of how to use the amsdal-glue
package to connect to a SQLite database and execute a query:
import amsdal_glue
from amsdal_glue import (
init_default_containers,
QueryStatement,
FieldReference,
Field,
OrderByQuery,
OrderDirection,
SchemaReference,
Version,
DataQueryOperation,
)
def main() -> None:
init_default_containers()
# Register a connection to a SQLite database
connection_mng = amsdal_glue.Container.managers.get(amsdal_glue.ConnectionManager)
connection_mng.register_connection_pool(
amsdal_glue.DefaultConnectionPool(
amsdal_glue.SqliteConnection,
db_path='customers.sqlite',
check_same_thread=False, # The default parallel executor works on top of threads
),
)
# Build a query
query = QueryStatement(
only=[
FieldReference(field=Field(name='id'), table_name='c'),
FieldReference(field=Field(name='first_name'), table_name='c'),
],
table=SchemaReference(name='customers', alias='c', version=Version.LATEST),
order_by=[
OrderByQuery(
field=FieldReference(field=Field(name='id'), table_name='c'),
direction=OrderDirection.ASC,
),
],
)
# Execute the query
service = amsdal_glue.Container.services.get(amsdal_glue.DataQueryService)
data_result = service.execute(
query_op=DataQueryOperation(
query=query,
),
)
assert data_result.success is True
assert [item.data for item in data_result.data] == [
{'id': 1, 'first_name': 'John'},
{'id': 2, 'first_name': 'Robert'},
{'id': 3, 'first_name': 'David'},
{'id': 4, 'first_name': 'John'},
{'id': 5, 'first_name': 'Betty'},
]
import amsdal_glue
from amsdal_glue import (
init_default_containers,
QueryStatement,
FieldReference,
Field,
OrderByQuery,
OrderDirection,
SchemaReference,
Version,
DataQueryOperation,
)
async def main() -> None:
init_default_containers()
# Register a connection to a SQLite database
connection_mng = amsdal_glue.Container.managers.get(amsdal_glue.AsyncConnectionManager)
connection_mng.register_connection_pool(
amsdal_glue.DefaultAsyncConnectionPool(
amsdal_glue.AsyncSqliteConnection,
db_path='customers.sqlite',
check_same_thread=False, # The default parallel executor works on top of threads
),
)
# Build a query
query = QueryStatement(
only=[
FieldReference(field=Field(name='id'), table_name='c'),
FieldReference(field=Field(name='first_name'), table_name='c'),
],
table=SchemaReference(name='customers', alias='c', version=Version.LATEST),
order_by=[
OrderByQuery(
field=FieldReference(field=Field(name='id'), table_name='c'),
direction=OrderDirection.ASC,
),
],
)
# Execute the query
service = amsdal_glue.Container.services.get(amsdal_glue.AsyncDataQueryService)
data_result = await service.execute(
query_op=DataQueryOperation(
query=query,
),
)
assert data_result.success is True
assert [item.data for item in data_result.data] == [
{'id': 1, 'first_name': 'John'},
{'id': 2, 'first_name': 'Robert'},
{'id': 3, 'first_name': 'David'},
{'id': 4, 'first_name': 'John'},
{'id': 5, 'first_name': 'Betty'},
]
This example demonstrates how to connect to a SQLite database, build a query, and execute it using the AMSDAL Glue. For more examples and detailed documentation, please see the Examples and api sections.
Performance¶
AMSDAL Glue is designed to be highly performant. At it's core amsdal-glue-core is simply an interface, allowing you to delegate as much of the workload to the underlying database technology as possible.
In practice, AMSDAL Glue comes at relatively little cost as seen in our benchmarking results
You can find more benchmarking results here
Roadmap¶
The AMSDAL Glue project is under active development, and we have plans to add more features and improvements in the future. Some of the planned features include:
- Support for data transformation/hooks (coming soon)
- Support for more database types:
- ✅SQLite
- ✅PostgreSQL
- ⬜MySQL
- ⬜MongoDB
- ⬜Iceberg support
- ✅ SQL to AMSDAL Glue query & command translator to have an ability to integrate with any existing SQL ORM library
- Add support for more complex queries and commands
License¶
This project is licensed under the Apache License, Version 2.0 - see the LICENSE file for details.