Drew Bednar 5258f41475 | 2 weeks ago | |
---|---|---|
.gitignore | 2 weeks ago | |
README.md | 2 weeks ago |
README.md
High Performance SQLite3
https://highperformancesqlite.com
brew install sqlite3
Dot Commands
Be sure to learn the .commands for sqlite. In a session you can see theses with
.help
You can also execute these commands from the commandline like so
sqlite3 your_database.db -cmd ".tables"
PRAGMA Statements
https://www.sqlite.org/pragma.html
The PRAGMA statements are sqlite3 specific and used to modify the behavior of the library.
To see all the pragmas supported:
sqlite> pragma pragma_list;
sqlite> pragma page_size;
4096
sqlite> pragma page_count;
5
Sure enough 4096 * 5 = 20480 bytes as we can see with ls
-rw-r--r-- 1 toor toor 20480 Jul 6 21:28 hellosql.db
Some of these are connection only, other's are persistent. Example busy_timeout
will only be non-persistent.
An example of an important and persistent setting is the journal_mode
. This is rollback mode where it will delete the journal when it's done. WAL
mode is an alternative we will look into.
sqlite> pragma journal_mode;
delete
A VERY important one to consider is the pragma compile_options;
Where you can see how this database was compiled.
Example, this shows that the brew version was compiled with Full Text Search:
ENABLE_FTS3
ENABLE_FTS3_PARENTHESIS
ENABLE_FTS4
ENABLE_FTS5
If you look at the compile options for the MacBook version you will see an OMIT_LOAD_EXTENSIONS flag that means you can't use extensions.
Another useful dot command to consider is the .dbconfig but you get less information.
sqlite> .dbconfig
defensive on
dqs_ddl off
dqs_dml off
enable_fkey off
enable_qpsg off
enable_trigger on
enable_view on
fts3_tokenizer off
legacy_alter_table off
legacy_file_format off
load_extension on
no_ckpt_on_close off
reset_database off
reverse_scanorder off
stmt_scanstatus off
trigger_eqp off
trusted_schema off
writable_schema off
Another important one is foreign key constraints. Remember they prioritized backwards support over new features, so you need to turn this on.
sqlite> pragma foreign_keys;
0
We can simply turn that on with
pragma foreign_keys=1;
This is a session level setting. You will need set this at the beginning off every connection.
Chat Gipity actually can give a good starting point for pragma's to set with reasons for why it would be beneficial. This is a starting point though. You should do your own load testing, and figure out what settings work best for your specific appliction.
Example "Give me the set of pragma's for sqlite that will benefit performance in service of a web application backend database.
PRAGMA journal_mode = WAL;
PRAGMA synchronous = NORMAL;
PRAGMA cache_size = -2000; -- 2MB
PRAGMA temp_store = MEMORY;
PRAGMA foreign_keys = ON;
PRAGMA mmap_size = 268435456; -- 256MB
PRAGMA optimize;
PRAGMA busy_timeout = 30000; -- 30 seconds
PRAGMA automatic_index = ON;
PRAGMA locking_mode = EXCLUSIVE;
Remember to look these up.
Loadable extensions
https://www.sqlite.org/loadext.html Additional Extensions: https://github.com/nalgeon/sqlean
You saw above that the Full Text Search extensions were part of the compile options output, thosw are extensions that where compiled into the binary. We can also load extensions if we take on the compilations ourselves Let's follow the instructions to compile and load the csv extension that will let us treat a csv file as a virtual table.
Download the extension source.
A loadable extension obviously can be user code too, but for now we will just stick with the csv extension created by the sqlite team https://www.sqlite.org/src/tree?name=ext.
git clone https://github.com/sqlite/sqlite && cd sqlite
Since I am using version 3.46.0 I will checkout that tag and navigate to the source code of the extension.
git checkout version-3.46.0
cd ext/misc/
Using the example command we see from compilation example
gcc -g -fPIC -shared YourCode.c -o YourCode.so
We will compile the csv.so object.
gcc -g -fPIC -shared csv.c -o csv.so
Now create a new directory to share this with all users and relocate the object.
mkdir /usr/local/lib/sqlite3-ext/
sudo mv csv.so /usr/local/lib/sqlite3-ext/
Be sure to add this new directory to your LD_LIBRARY_PATH
in your .bashrc
file.
export LD_LIBRARY_PATH=/usr/local/lib/sqlite3-ext:$LD_LIBRARY_PATH
source ~/.bashrc
and now you should be able to load this extenstion from the sqlite3 cli.
sqlite> .load csv
sqlite>
There are other extensions that add user authentication to a sqlite database, there is also an extension that can encrypt the file at rest.
Virtual Tables
Allow you to query a table that is not actually in your database.
Let's take advantage of the csv extension we compiled in the previous step and use that to query a csv file that is actually not part of the database.
Just create some data:
cat <<EOF > ./data/mydata.csv
id,name,level,speed,power
1,Morbinian,20,44,65
2,Tomoro,15,23,23
3,Wonku,99,87,93
EOF
Now we can create a virtual table using the contents, and query them just like you would expect.
sqlite> .load csv
sqlite> CREATE VIRTUAL TABLE npcs USING csv(filename="./data/mydata.csv", header=true);
sqlite> .mode box
sqlite> SELECT * FROM npcs;
┌────┬───────────┬───────┬───────┬───────┐
│ id │ name │ level │ speed │ power │
├────┼───────────┼───────┼───────┼───────┤
│ 1 │ Morbinian │ 20 │ 44 │ 65 │
│ 2 │ Tomoro │ 15 │ 23 │ 23 │
│ 3 │ Wonku │ 99 │ 87 │ 93 │
└────┴───────────┴───────┴───────┴───────┘
There are other virtual tricks you can use too like
SELECT * FROM generate_series(1,100);
Which you could use in a join, or filter statement.
Look at json_tree
and json_each
functions also. Another virtual table backed feature is Full Text Search 5.
Floating Point
There is no Decimal type for SQLite. There is only the REAL data type that represents a floating point number. The real datatype gaurentees 15 digits of precision.
Let's look at the example of floating point precision.
CREATE TABLE floats (val REAL);
INSERT INTO floats VALUES (26.3sqlite> CREATE TABLE floats (val REAL);
INSERT INTO floats VALUES (26.3), (-10.52), (-15.78);
sqlite> .mode box
sqlite> SELECT val FROM floats;
┌────────┐
│ val │
├────────┤
│ 26.3 │
│ -10.52 │
│ -15.78 │
└────────┘
sqlite> SELECT sum(val) FROM floats;
┌──────────────────────┐
│ sum(val) │
├──────────────────────┤
│ 1.77635683940025e-15 │
└──────────────────────┘
If you do this on paper 26.3 + -10.52 + -15.78 = 0
it should have come out to zero.
There is a function that could have helped us with this. This decimal_sum comes from a decimal extension. NOTE this extension ships in the cli, but not built into the actual sqlite binary.
sqlite> SELECT decimal_sum(val) FROM floats;
┌──────────────────┐
│ decimal_sum(val) │
├──────────────────┤
│ 0.00 │
└──────────────────┘
Optimizing SQLite
Locking
SQLite can only support one writer process at a time, but it can support many readers at any given time.
Locking States:
- Unlocked: No reading / writing
- Shared Lock: Process will acquire a shared lock and all other processes can read the DB too, but no process can write
- Reserved Lock: Process announces it is going to write. In the reserved lock state new shared locks can be acquired.
- Pending: No new readers can start. We are waiting for all the open shared locks to close out.
- ** Exclusive Lock**: Process writes as fast as it can to the DB then gives up the lock allowing for other processes to acquire shared locks again.
Rollback mode
Before the db goes to write something it takes the page being modified, sets aside, then when the new page is successfully written to disk it will disregard the previous page. The act of throughing that data away is what completes the commit process.
WAL mode (Write Ahead Log)
The most important thing for SQLite Performance
pragma journal_mode;
To view the current mode. By default it's delete. This IS a persistent setting across sessions. So set once and all subsequent connections will also have this setting.
In wall mode a new file contains the new writes. At some point a checkpoint will occur, and the WAL log is taken and merged with the db, and the process happens again. Checkpointing is configurable. It's a whole lot faster. We still have a single writer but can have many concurrent readers still. The difference is that the reader will have knowledge of WAL transactions up to a specific point. In this mode writer doesn't block the reader anymore.# High Performance SQLite
https://highperformancesqlite.com
brew install sqlite3
Dot Commands
Be sure to learn the .commands for sqlite. In a session you can see theses with
.help
You can also execute these commands from the commandline like so
sqlite3 your_database.db -cmd ".tables"
PRAGMA Statements
https://www.sqlite.org/pragma.html
The PRAGMA statements are sqlite3 specific and used to modify the behavior of the library.
To see all the pragmas supported:
sqlite> pragma pragma_list;
sqlite> pragma page_size;
4096
sqlite> pragma page_count;
5
Sure enough 4096 * 5 = 20480 bytes as we can see with ls
-rw-r--r-- 1 toor toor 20480 Jul 6 21:28 hellosql.db
Some of these are connection only, other's are persistent. Example busy_timeout
will only be non-persistent.
An example of an important and persistent setting is the journal_mode
. This is rollback mode where it will delete the journal when it's done. WAL
mode is an alternative we will look into.
sqlite> pragma journal_mode;
delete
A VERY important one to consider is the pragma compile_options;
Where you can see how this database was compiled.
Example, this shows that the brew version was compiled with Full Text Search:
ENABLE_FTS3
ENABLE_FTS3_PARENTHESIS
ENABLE_FTS4
ENABLE_FTS5
If you look at the compile options for the MacBook version you will see an OMIT_LOAD_EXTENSIONS flag that means you can't use extensions.
Another useful dot command to consider is the .dbconfig but you get less information.
sqlite> .dbconfig
defensive on
dqs_ddl off
dqs_dml off
enable_fkey off
enable_qpsg off
enable_trigger on
enable_view on
fts3_tokenizer off
legacy_alter_table off
legacy_file_format off
load_extension on
no_ckpt_on_close off
reset_database off
reverse_scanorder off
stmt_scanstatus off
trigger_eqp off
trusted_schema off
writable_schema off
Another important one is foreign key constraints. Remember they prioritized backwards support over new features, so you need to turn this on.
sqlite> pragma foreign_keys;
0
We can simply turn that on with
pragma foreign_keys=1;
This is a session level setting. You will need set this at the beginning off every connection.
Chat Gipity actually can give a good starting point for pragma's to set with reasons for why it would be beneficial. This is a starting point though. You should do your own load testing, and figure out what settings work best for your specific appliction.
Example "Give me the set of pragma's for sqlite that will benefit performance in service of a web application backend database.
PRAGMA journal_mode = WAL;
PRAGMA synchronous = NORMAL;
PRAGMA cache_size = -2000; -- 2MB
PRAGMA temp_store = MEMORY;
PRAGMA foreign_keys = ON;
PRAGMA mmap_size = 268435456; -- 256MB
PRAGMA optimize;
PRAGMA busy_timeout = 30000; -- 30 seconds
PRAGMA automatic_index = ON;
PRAGMA locking_mode = EXCLUSIVE;
Remember to look these up.
Loadable extensions
https://www.sqlite.org/loadext.html Additional Extensions: https://github.com/nalgeon/sqlean
You saw above that the Full Text Search extensions were part of the compile options output, thosw are extensions that where compiled into the binary. We can also load extensions if we take on the compilations ourselves Let's follow the instructions to compile and load the csv extension that will let us treat a csv file as a virtual table.
Download the extension source.
A loadable extension obviously can be user code too, but for now we will just stick with the csv extension created by the sqlite team https://www.sqlite.org/src/tree?name=ext.
git clone https://github.com/sqlite/sqlite && cd sqlite
Since I am using version 3.46.0 I will checkout that tag and navigate to the source code of the extension.
git checkout version-3.46.0
cd ext/misc/
Using the example command we see from compilation example
gcc -g -fPIC -shared YourCode.c -o YourCode.so
We will compile the csv.so object.
gcc -g -fPIC -shared csv.c -o csv.so
Now create a new directory to share this with all users and relocate the object.
mkdir /usr/local/lib/sqlite3-ext/
sudo mv csv.so /usr/local/lib/sqlite3-ext/
Be sure to add this new directory to your LD_LIBRARY_PATH
in your .bashrc
file.
export LD_LIBRARY_PATH=/usr/local/lib/sqlite3-ext:$LD_LIBRARY_PATH
source ~/.bashrc
and now you should be able to load this extenstion from the sqlite3 cli.
sqlite> .load csv
sqlite>
There are other extensions that add user authentication to a sqlite database, there is also an extension that can encrypt the file at rest.
Virtual Tables
Allow you to query a table that is not actually in your database.
Let's take advantage of the csv extension we compiled in the previous step and use that to query a csv file that is actually not part of the database.
Just create some data:
cat <<EOF > ./data/mydata.csv
id,name,level,speed,power
1,Morbinian,20,44,65
2,Tomoro,15,23,23
3,Wonku,99,87,93
EOF
Now we can create a virtual table using the contents, and query them just like you would expect.
sqlite> .load csv
sqlite> CREATE VIRTUAL TABLE npcs USING csv(filename="./data/mydata.csv", header=true);
sqlite> .mode box
sqlite> SELECT * FROM npcs;
┌────┬───────────┬───────┬───────┬───────┐
│ id │ name │ level │ speed │ power │
├────┼───────────┼───────┼───────┼───────┤
│ 1 │ Morbinian │ 20 │ 44 │ 65 │
│ 2 │ Tomoro │ 15 │ 23 │ 23 │
│ 3 │ Wonku │ 99 │ 87 │ 93 │
└────┴───────────┴───────┴───────┴───────┘
There are other virtual tricks you can use too like
SELECT * FROM generate_series(1,100);
Which you could use in a join, or filter statement.
Look at json_tree
and json_each
functions also. Another virtual table backed feature is Full Text Search 5.
Floating Point
There is no Decimal type for SQLite. There is only the REAL data type that represents a floating point number. The real datatype gaurentees 15 digits of precision.
Let's look at the example of floating point precision.
CREATE TABLE floats (val REAL);
INSERT INTO floats VALUES (26.3sqlite> CREATE TABLE floats (val REAL);
INSERT INTO floats VALUES (26.3), (-10.52), (-15.78);
sqlite> .mode box
sqlite> SELECT val FROM floats;
┌────────┐
│ val │
├────────┤
│ 26.3 │
│ -10.52 │
│ -15.78 │
└────────┘
sqlite> SELECT sum(val) FROM floats;
┌──────────────────────┐
│ sum(val) │
├──────────────────────┤
│ 1.77635683940025e-15 │
└──────────────────────┘
If you do this on paper 26.3 + -10.52 + -15.78 = 0
it should have come out to zero.
There is a function that could have helped us with this. This decimal_sum comes from a decimal extension. NOTE this extension ships in the cli, but not built into the actual sqlite binary.
sqlite> SELECT decimal_sum(val) FROM floats;
┌──────────────────┐
│ decimal_sum(val) │
├──────────────────┤
│ 0.00 │
└──────────────────┘
Optimizing SQLite
Locking
SQLite can only support one writer process at a time, but it can support many readers at any given time.
Locking States:
- Unlocked: No reading / writing
- Shared Lock: Process will acquire a shared lock and all other processes can read the DB too, but no process can write
- Reserved Lock: Process announces it is going to write. In the reserved lock state new shared locks can be acquired.
- Pending: No new readers can start. We are waiting for all the open shared locks to close out.
- ** Exclusive Lock**: Process writes as fast as it can to the DB then gives up the lock allowing for other processes to acquire shared locks again.
Rollback mode
Before the db goes to write something it takes the page being modified, sets aside, then when the new page is successfully written to disk it will disregard the previous page. The act of throughing that data away is what completes the commit process.
WAL mode (Write Ahead Log)
The most important thing for SQLite Performance
pragma journal_mode;
To view the current mode. By default it's delete. This IS a persistent setting across sessions. So set once and all subsequent connections will also have this setting.
In wall mode a new file contains the new writes. At some point a checkpoint will occur, and the WAL log is taken and merged with the db, and the process happens again. Checkpointing is configurable. It's a whole lot faster. We still have a single writer but can have many concurrent readers still. The difference is that the reader will have knowledge of WAL transactions up to a specific point. In this mode writer doesn't block the reader anymore.# High Performance SQLite
https://highperformancesqlite.com
brew install sqlite3
Dot Commands
Be sure to learn the .commands for sqlite. In a session you can see theses with
.help
You can also execute these commands from the commandline like so
sqlite3 your_database.db -cmd ".tables"
PRAGMA Statements
https://www.sqlite.org/pragma.html
The PRAGMA statements are sqlite3 specific and used to modify the behavior of the library.
To see all the pragmas supported:
sqlite> pragma pragma_list;
sqlite> pragma page_size;
4096
sqlite> pragma page_count;
5
Sure enough 4096 * 5 = 20480 bytes as we can see with ls
-rw-r--r-- 1 toor toor 20480 Jul 6 21:28 hellosql.db
Some of these are connection only, other's are persistent. Example busy_timeout
will only be non-persistent.
An example of an important and persistent setting is the journal_mode
. This is rollback mode where it will delete the journal when it's done. WAL
mode is an alternative we will look into.
sqlite> pragma journal_mode;
delete
A VERY important one to consider is the pragma compile_options;
Where you can see how this database was compiled.
Example, this shows that the brew version was compiled with Full Text Search:
ENABLE_FTS3
ENABLE_FTS3_PARENTHESIS
ENABLE_FTS4
ENABLE_FTS5
If you look at the compile options for the MacBook version you will see an OMIT_LOAD_EXTENSIONS flag that means you can't use extensions.
Another useful dot command to consider is the .dbconfig but you get less information.
sqlite> .dbconfig
defensive on
dqs_ddl off
dqs_dml off
enable_fkey off
enable_qpsg off
enable_trigger on
enable_view on
fts3_tokenizer off
legacy_alter_table off
legacy_file_format off
load_extension on
no_ckpt_on_close off
reset_database off
reverse_scanorder off
stmt_scanstatus off
trigger_eqp off
trusted_schema off
writable_schema off
Another important one is foreign key constraints. Remember they prioritized backwards support over new features, so you need to turn this on.
sqlite> pragma foreign_keys;
0
We can simply turn that on with
pragma foreign_keys=1;
This is a session level setting. You will need set this at the beginning off every connection.
Chat Gipity actually can give a good starting point for pragma's to set with reasons for why it would be beneficial. This is a starting point though. You should do your own load testing, and figure out what settings work best for your specific appliction.
Example "Give me the set of pragma's for sqlite that will benefit performance in service of a web application backend database.
PRAGMA journal_mode = WAL;
PRAGMA synchronous = NORMAL;
PRAGMA cache_size = -2000; -- 2MB
PRAGMA temp_store = MEMORY;
PRAGMA foreign_keys = ON;
PRAGMA mmap_size = 268435456; -- 256MB
PRAGMA optimize;
PRAGMA busy_timeout = 30000; -- 30 seconds
PRAGMA automatic_index = ON;
PRAGMA locking_mode = EXCLUSIVE;
Remember to look these up.
Loadable extensions
https://www.sqlite.org/loadext.html Additional Extensions: https://github.com/nalgeon/sqlean
You saw above that the Full Text Search extensions were part of the compile options output, thosw are extensions that where compiled into the binary. We can also load extensions if we take on the compilations ourselves Let's follow the instructions to compile and load the csv extension that will let us treat a csv file as a virtual table.
Download the extension source.
A loadable extension obviously can be user code too, but for now we will just stick with the csv extension created by the sqlite team https://www.sqlite.org/src/tree?name=ext.
git clone https://github.com/sqlite/sqlite && cd sqlite
Since I am using version 3.46.0 I will checkout that tag and navigate to the source code of the extension.
git checkout version-3.46.0
cd ext/misc/
Using the example command we see from compilation example
gcc -g -fPIC -shared YourCode.c -o YourCode.so
We will compile the csv.so object.
gcc -g -fPIC -shared csv.c -o csv.so
Now create a new directory to share this with all users and relocate the object.
mkdir /usr/local/lib/sqlite3-ext/
sudo mv csv.so /usr/local/lib/sqlite3-ext/
Be sure to add this new directory to your LD_LIBRARY_PATH
in your .bashrc
file.
export LD_LIBRARY_PATH=/usr/local/lib/sqlite3-ext:$LD_LIBRARY_PATH
source ~/.bashrc
and now you should be able to load this extenstion from the sqlite3 cli.
sqlite> .load csv
sqlite>
There are other extensions that add user authentication to a sqlite database, there is also an extension that can encrypt the file at rest.
Virtual Tables
Allow you to query a table that is not actually in your database.
Let's take advantage of the csv extension we compiled in the previous step and use that to query a csv file that is actually not part of the database.
Just create some data:
cat <<EOF > ./data/mydata.csv
id,name,level,speed,power
1,Morbinian,20,44,65
2,Tomoro,15,23,23
3,Wonku,99,87,93
EOF
Now we can create a virtual table using the contents, and query them just like you would expect.
sqlite> .load csv
sqlite> CREATE VIRTUAL TABLE npcs USING csv(filename="./data/mydata.csv", header=true);
sqlite> .mode box
sqlite> SELECT * FROM npcs;
┌────┬───────────┬───────┬───────┬───────┐
│ id │ name │ level │ speed │ power │
├────┼───────────┼───────┼───────┼───────┤
│ 1 │ Morbinian │ 20 │ 44 │ 65 │
│ 2 │ Tomoro │ 15 │ 23 │ 23 │
│ 3 │ Wonku │ 99 │ 87 │ 93 │
└────┴───────────┴───────┴───────┴───────┘
There are other virtual tricks you can use too like
SELECT * FROM generate_series(1,100);
Which you could use in a join, or filter statement.
Look at json_tree
and json_each
functions also. Another virtual table backed feature is Full Text Search 5.
Floating Point
There is no Decimal type for SQLite. There is only the REAL data type that represents a floating point number. The real datatype gaurentees 15 digits of precision.
Let's look at the example of floating point precision.
CREATE TABLE floats (val REAL);
INSERT INTO floats VALUES (26.3sqlite> CREATE TABLE floats (val REAL);
INSERT INTO floats VALUES (26.3), (-10.52), (-15.78);
sqlite> .mode box
sqlite> SELECT val FROM floats;
┌────────┐
│ val │
├────────┤
│ 26.3 │
│ -10.52 │
│ -15.78 │
└────────┘
sqlite> SELECT sum(val) FROM floats;
┌──────────────────────┐
│ sum(val) │
├──────────────────────┤
│ 1.77635683940025e-15 │
└──────────────────────┘
If you do this on paper 26.3 + -10.52 + -15.78 = 0
it should have come out to zero.
There is a function that could have helped us with this. This decimal_sum comes from a decimal extension. NOTE this extension ships in the cli, but not built into the actual sqlite binary.
sqlite> SELECT decimal_sum(val) FROM floats;
┌──────────────────┐
│ decimal_sum(val) │
├──────────────────┤
│ 0.00 │
└──────────────────┘
Optimizing SQLite
Locking
SQLite can only support one writer process at a time, but it can support many readers at any given time.
Locking States:
- Unlocked: No reading / writing
- Shared Lock: Process will acquire a shared lock and all other processes can read the DB too, but no process can write
- Reserved Lock: Process announces it is going to write. In the reserved lock state new shared locks can be acquired.
- Pending: No new readers can start. We are waiting for all the open shared locks to close out.
- ** Exclusive Lock**: Process writes as fast as it can to the DB then gives up the lock allowing for other processes to acquire shared locks again.
Rollback mode
Before the db goes to write something it takes the page being modified, sets aside, then when the new page is successfully written to disk it will disregard the previous page. The act of throughing that data away is what completes the commit process.
WAL mode (Write Ahead Log)
The most important thing for SQLite Performance
pragma journal_mode;
To view the current mode. By default it's delete. This IS a persistent setting across sessions. So set once and all subsequent connections will also have this setting.
In wall mode a new file contains the new writes. At some point a checkpoint will occur, and the WAL log is taken and merged with the db, and the process happens again. Checkpointing is configurable. It's a whole lot faster. We still have a single writer but can have many concurrent readers still. The difference is that the reader will have knowledge of WAL transactions up to a specific point. In this mode writer doesn't block the reader anymore.# High Performance SQLite
https://highperformancesqlite.com
brew install sqlite3
Dot Commands
Be sure to learn the .commands for sqlite. In a session you can see theses with
.help
You can also execute these commands from the commandline like so
sqlite3 your_database.db -cmd ".tables"
PRAGMA Statements
https://www.sqlite.org/pragma.html
The PRAGMA statements are sqlite3 specific and used to modify the behavior of the library.
To see all the pragmas supported:
sqlite> pragma pragma_list;
sqlite> pragma page_size;
4096
sqlite> pragma page_count;
5
Sure enough 4096 * 5 = 20480 bytes as we can see with ls
-rw-r--r-- 1 toor toor 20480 Jul 6 21:28 hellosql.db
Some of these are connection only, other's are persistent. Example busy_timeout
will only be non-persistent.
An example of an important and persistent setting is the journal_mode
. This is rollback mode where it will delete the journal when it's done. WAL
mode is an alternative we will look into.
sqlite> pragma journal_mode;
delete
A VERY important one to consider is the pragma compile_options;
Where you can see how this database was compiled.
Example, this shows that the brew version was compiled with Full Text Search:
ENABLE_FTS3
ENABLE_FTS3_PARENTHESIS
ENABLE_FTS4
ENABLE_FTS5
If you look at the compile options for the MacBook version you will see an OMIT_LOAD_EXTENSIONS flag that means you can't use extensions.
Another useful dot command to consider is the .dbconfig but you get less information.
sqlite> .dbconfig
defensive on
dqs_ddl off
dqs_dml off
enable_fkey off
enable_qpsg off
enable_trigger on
enable_view on
fts3_tokenizer off
legacy_alter_table off
legacy_file_format off
load_extension on
no_ckpt_on_close off
reset_database off
reverse_scanorder off
stmt_scanstatus off
trigger_eqp off
trusted_schema off
writable_schema off
Another important one is foreign key constraints. Remember they prioritized backwards support over new features, so you need to turn this on.
sqlite> pragma foreign_keys;
0
We can simply turn that on with
pragma foreign_keys=1;
This is a session level setting. You will need set this at the beginning off every connection.
Chat Gipity actually can give a good starting point for pragma's to set with reasons for why it would be beneficial. This is a starting point though. You should do your own load testing, and figure out what settings work best for your specific appliction.
Example "Give me the set of pragma's for sqlite that will benefit performance in service of a web application backend database.
PRAGMA journal_mode = WAL;
PRAGMA synchronous = NORMAL;
PRAGMA cache_size = -2000; -- 2MB
PRAGMA temp_store = MEMORY;
PRAGMA foreign_keys = ON;
PRAGMA mmap_size = 268435456; -- 256MB
PRAGMA optimize;
PRAGMA busy_timeout = 30000; -- 30 seconds
PRAGMA automatic_index = ON;
PRAGMA locking_mode = EXCLUSIVE;
Remember to look these up.
Loadable extensions
https://www.sqlite.org/loadext.html Additional Extensions: https://github.com/nalgeon/sqlean
You saw above that the Full Text Search extensions were part of the compile options output, thosw are extensions that where compiled into the binary. We can also load extensions if we take on the compilations ourselves Let's follow the instructions to compile and load the csv extension that will let us treat a csv file as a virtual table.
Download the extension source.
A loadable extension obviously can be user code too, but for now we will just stick with the csv extension created by the sqlite team https://www.sqlite.org/src/tree?name=ext.
git clone https://github.com/sqlite/sqlite && cd sqlite
Since I am using version 3.46.0 I will checkout that tag and navigate to the source code of the extension.
git checkout version-3.46.0
cd ext/misc/
Using the example command we see from compilation example
gcc -g -fPIC -shared YourCode.c -o YourCode.so
We will compile the csv.so object.
gcc -g -fPIC -shared csv.c -o csv.so
Now create a new directory to share this with all users and relocate the object.
mkdir /usr/local/lib/sqlite3-ext/
sudo mv csv.so /usr/local/lib/sqlite3-ext/
Be sure to add this new directory to your LD_LIBRARY_PATH
in your .bashrc
file.
export LD_LIBRARY_PATH=/usr/local/lib/sqlite3-ext:$LD_LIBRARY_PATH
source ~/.bashrc
and now you should be able to load this extenstion from the sqlite3 cli.
sqlite> .load csv
sqlite>
There are other extensions that add user authentication to a sqlite database, there is also an extension that can encrypt the file at rest.
Virtual Tables
Allow you to query a table that is not actually in your database.
Let's take advantage of the csv extension we compiled in the previous step and use that to query a csv file that is actually not part of the database.
Just create some data:
cat <<EOF > ./data/mydata.csv
id,name,level,speed,power
1,Morbinian,20,44,65
2,Tomoro,15,23,23
3,Wonku,99,87,93
EOF
Now we can create a virtual table using the contents, and query them just like you would expect.
sqlite> .load csv
sqlite> CREATE VIRTUAL TABLE npcs USING csv(filename="./data/mydata.csv", header=true);
sqlite> .mode box
sqlite> SELECT * FROM npcs;
┌────┬───────────┬───────┬───────┬───────┐
│ id │ name │ level │ speed │ power │
├────┼───────────┼───────┼───────┼───────┤
│ 1 │ Morbinian │ 20 │ 44 │ 65 │
│ 2 │ Tomoro │ 15 │ 23 │ 23 │
│ 3 │ Wonku │ 99 │ 87 │ 93 │
└────┴───────────┴───────┴───────┴───────┘
There are other virtual tricks you can use too like
SELECT * FROM generate_series(1,100);
Which you could use in a join, or filter statement.
Look at json_tree
and json_each
functions also. Another virtual table backed feature is Full Text Search 5.
Floating Point
There is no Decimal type for SQLite. There is only the REAL data type that represents a floating point number. The real datatype gaurentees 15 digits of precision.
Let's look at the example of floating point precision.
CREATE TABLE floats (val REAL);
INSERT INTO floats VALUES (26.3sqlite> CREATE TABLE floats (val REAL);
INSERT INTO floats VALUES (26.3), (-10.52), (-15.78);
sqlite> .mode box
sqlite> SELECT val FROM floats;
┌────────┐
│ val │
├────────┤
│ 26.3 │
│ -10.52 │
│ -15.78 │
└────────┘
sqlite> SELECT sum(val) FROM floats;
┌──────────────────────┐
│ sum(val) │
├──────────────────────┤
│ 1.77635683940025e-15 │
└──────────────────────┘
If you do this on paper 26.3 + -10.52 + -15.78 = 0
it should have come out to zero.
There is a function that could have helped us with this. This decimal_sum comes from a decimal extension. NOTE this extension ships in the cli, but not built into the actual sqlite binary.
sqlite> SELECT decimal_sum(val) FROM floats;
┌──────────────────┐
│ decimal_sum(val) │
├──────────────────┤
│ 0.00 │
└──────────────────┘
Optimizing SQLite
Locking
SQLite can only support one writer process at a time, but it can support many readers at any given time.
Locking States:
- Unlocked: No reading / writing
- Shared Lock: Process will acquire a shared lock and all other processes can read the DB too, but no process can write
- Reserved Lock: Process announces it is going to write. In the reserved lock state new shared locks can be acquired.
- Pending: No new readers can start. We are waiting for all the open shared locks to close out.
- ** Exclusive Lock**: Process writes as fast as it can to the DB then gives up the lock allowing for other processes to acquire shared locks again.
Rollback mode
Before the db goes to write something it takes the page being modified, sets aside, then when the new page is successfully written to disk it will disregard the previous page. The act of throughing that data away is what completes the commit process.
WAL mode (Write Ahead Log)
The most important thing for SQLite Performance
pragma journal_mode;
To view the current mode. By default it's delete. This IS a persistent setting across sessions. So set once and all subsequent connections will also have this setting.
In wall mode a new file contains the new writes. At some point a checkpoint will occur, and the WAL log is taken and merged with the db, and the process happens again. Checkpointing is configurable. It's a whole lot faster. We still have a single writer but can have many concurrent readers still. The difference is that the reader will have knowledge of WAL transactions up to a specific point. In this mode writer doesn't block the reader anymore.