From 5258f4147541899f7fc683329efd4ad3a90a169b Mon Sep 17 00:00:00 2001 From: Drew Bednar Date: Sat, 18 Jan 2025 14:42:27 -0500 Subject: [PATCH] Saving sqlite3 notes --- .gitignore | 2 + README.md | 1189 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 1191 insertions(+) create mode 100644 .gitignore create mode 100644 README.md diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..6dbccb1 --- /dev/null +++ b/.gitignore @@ -0,0 +1,2 @@ +data/ + diff --git a/README.md b/README.md new file mode 100644 index 0000000..7c404c9 --- /dev/null +++ b/README.md @@ -0,0 +1,1189 @@ +# High Performance SQLite3 + +https://highperformancesqlite.com + +``` +brew install sqlite3 +``` + +## Dot Commands + +Be sure to learn the .commands for sqlite. In a session you can see theses with + +``` +.help +``` + +You can also execute these commands from the commandline like so + +``` +sqlite3 your_database.db -cmd ".tables" + +``` + +## PRAGMA Statements + +https://www.sqlite.org/pragma.html + +The PRAGMA statements are sqlite3 specific and used to modify the behavior of the library. + + +To see all the pragmas supported: +``` +sqlite> pragma pragma_list; +``` + +``` +sqlite> pragma page_size; +4096 +``` + +``` +sqlite> pragma page_count; +5 +``` + +Sure enough 4096 * 5 = 20480 bytes as we can see with `ls` + +``` +-rw-r--r-- 1 toor toor 20480 Jul 6 21:28 hellosql.db +``` + +Some of these are connection only, other's are persistent. Example `busy_timeout` will only be non-persistent. + + +An example of an important and persistent setting is the `journal_mode`. This is rollback mode where it will delete the journal when it's done. `WAL` mode is an alternative we will look into. + +``` +sqlite> pragma journal_mode; +delete +``` + +A VERY important one to consider is the `pragma compile_options;` Where you can see how this database was compiled. + +Example, this shows that the brew version was compiled with Full Text Search: + +``` +ENABLE_FTS3 +ENABLE_FTS3_PARENTHESIS +ENABLE_FTS4 +ENABLE_FTS5 +``` + +If you look at the compile options for the MacBook version you will see an OMIT_LOAD_EXTENSIONS flag that means you can't use extensions. + + +Another useful dot command to consider is the .dbconfig but you get less information. +``` +sqlite> .dbconfig + defensive on + dqs_ddl off + dqs_dml off + enable_fkey off + enable_qpsg off + enable_trigger on + enable_view on + fts3_tokenizer off + legacy_alter_table off + legacy_file_format off + load_extension on + no_ckpt_on_close off + reset_database off + reverse_scanorder off + stmt_scanstatus off + trigger_eqp off + trusted_schema off + writable_schema off +``` + + +Another important one is foreign key constraints. Remember they prioritized backwards support over new features, so you need to turn this on. +``` +sqlite> pragma foreign_keys; +0 +``` + +We can simply turn that on with + +``` +pragma foreign_keys=1; +``` + +This is a session level setting. You will need set this at the beginning off every connection. + +Chat Gipity actually can give a good starting point for pragma's to set with reasons for why it would be beneficial. This is a starting point though. You should do your own load testing, and figure out what settings work best for your specific appliction. + +Example "Give me the set of pragma's for sqlite that will benefit performance in service of a web application backend database. + +``` +PRAGMA journal_mode = WAL; +PRAGMA synchronous = NORMAL; +PRAGMA cache_size = -2000; -- 2MB +PRAGMA temp_store = MEMORY; +PRAGMA foreign_keys = ON; +PRAGMA mmap_size = 268435456; -- 256MB +PRAGMA optimize; +PRAGMA busy_timeout = 30000; -- 30 seconds +PRAGMA automatic_index = ON; +PRAGMA locking_mode = EXCLUSIVE; +``` + +Remember to look these up. + +## Loadable extensions + +https://www.sqlite.org/loadext.html +Additional Extensions: https://github.com/nalgeon/sqlean + +You saw above that the Full Text Search extensions were part of the compile options output, thosw are extensions that where compiled into the binary. We can also load extensions if we take on the compilations ourselves Let's follow the instructions to compile and load the csv extension that will let us treat a csv file as a virtual table. + +### Download the extension source. + +A loadable extension obviously can be user code too, but for now we will just stick with the csv extension created by the sqlite team https://www.sqlite.org/src/tree?name=ext. + +``` +git clone https://github.com/sqlite/sqlite && cd sqlite +``` + +Since I am using version 3.46.0 I will checkout that tag and navigate to the source code of the extension. + +``` +git checkout version-3.46.0 +cd ext/misc/ +``` + +Using the example command we see from compilation example + +``` +gcc -g -fPIC -shared YourCode.c -o YourCode.so +``` + +We will compile the csv.so object. + +``` +gcc -g -fPIC -shared csv.c -o csv.so +``` + +Now create a new directory to share this with all users and relocate the object. + +``` +mkdir /usr/local/lib/sqlite3-ext/ +sudo mv csv.so /usr/local/lib/sqlite3-ext/ +``` + +Be sure to add this new directory to your `LD_LIBRARY_PATH` in your `.bashrc` file. + +`export LD_LIBRARY_PATH=/usr/local/lib/sqlite3-ext:$LD_LIBRARY_PATH` + +`source ~/.bashrc` and now you should be able to load this extenstion from the sqlite3 cli. + +``` +sqlite> .load csv +sqlite> +``` + +There are other extensions that add [user authentication to a sqlite database](https://github.com/sqlite/sqlite/blob/master/ext/userauth/user-auth.txt), there is also an extension that can encrypt the file at rest. + +## Virtual Tables + +Allow you to query a table that is not actually in your database. + +Let's take advantage of the csv extension we compiled in the previous step and use that to query a csv file that is actually not part of the database. + + +Just create some data: +``` +cat < ./data/mydata.csv +id,name,level,speed,power +1,Morbinian,20,44,65 +2,Tomoro,15,23,23 +3,Wonku,99,87,93 +EOF +``` + +Now we can create a virtual table using the contents, and query them just like you would expect. + +``` +sqlite> .load csv +sqlite> CREATE VIRTUAL TABLE npcs USING csv(filename="./data/mydata.csv", header=true); +sqlite> .mode box +sqlite> SELECT * FROM npcs; +┌────┬───────────┬───────┬───────┬───────┐ +│ id │ name │ level │ speed │ power │ +├────┼───────────┼───────┼───────┼───────┤ +│ 1 │ Morbinian │ 20 │ 44 │ 65 │ +│ 2 │ Tomoro │ 15 │ 23 │ 23 │ +│ 3 │ Wonku │ 99 │ 87 │ 93 │ +└────┴───────────┴───────┴───────┴───────┘ +``` + +There are other virtual tricks you can use too like + +``` +SELECT * FROM generate_series(1,100); +``` + +Which you could use in a join, or filter statement. + +Look at `json_tree` and `json_each` functions also. Another virtual table backed feature is Full Text Search 5. + +## Floating Point + +There is no Decimal type for SQLite. There is only the REAL data type that represents a floating point number. The real datatype gaurentees 15 digits of precision. + +Let's look at the example of floating point precision. + +``` +CREATE TABLE floats (val REAL); +INSERT INTO floats VALUES (26.3sqlite> CREATE TABLE floats (val REAL); +INSERT INTO floats VALUES (26.3), (-10.52), (-15.78); +sqlite> .mode box +sqlite> SELECT val FROM floats; +┌────────┐ +│ val │ +├────────┤ +│ 26.3 │ +│ -10.52 │ +│ -15.78 │ +└────────┘ + +sqlite> SELECT sum(val) FROM floats; +┌──────────────────────┐ +│ sum(val) │ +├──────────────────────┤ +│ 1.77635683940025e-15 │ +└──────────────────────┘ +``` + +If you do this on paper `26.3 + -10.52 + -15.78 = 0` it should have come out to zero. + +There is a function that could have helped us with this. This decimal_sum comes from a decimal extension. NOTE this extension ships in the cli, but not built into the actual sqlite binary. + +``` +sqlite> SELECT decimal_sum(val) FROM floats; +┌──────────────────┐ +│ decimal_sum(val) │ +├──────────────────┤ +│ 0.00 │ +└──────────────────┘ +``` + +## Optimizing SQLite + +### Locking + +SQLite can only support one writer process at a time, but it can support many readers at any given time. + +Locking States: +- **Unlocked**: No reading / writing +- **Shared Lock**: Process will acquire a shared lock and all other processes can read the DB too, but no process can write +- **Reserved Lock**: Process announces it is going to write. In the reserved lock state new shared locks can be acquired. +- **Pending**: No new readers can start. We are waiting for all the open shared locks to close out. +- ** Exclusive Lock**: Process writes as fast as it can to the DB then gives up the lock allowing for other processes to acquire shared locks again. + +### Rollback mode + +Before the db goes to write something it takes the page being modified, sets aside, then when the new page is successfully written to disk it will disregard the previous page. The act of throughing that data away is what completes the commit process. + +### WAL mode (Write Ahead Log) + +The most important thing for SQLite Performance + +``` +pragma journal_mode; +``` + +To view the current mode. By default it's delete. This IS a persistent setting across sessions. So set once and all subsequent connections will also have this setting. + +In wall mode a new file contains the new writes. At some point a checkpoint will occur, and the WAL log is taken and merged with the db, and the process happens again. Checkpointing is configurable. It's a whole lot faster. We still have a single writer but can have many concurrent readers still. The difference is that the reader will have knowledge of WAL transactions up to a specific point. In this mode writer doesn't block the reader anymore.# High Performance SQLite + +https://highperformancesqlite.com + +``` +brew install sqlite3 +``` + +## Dot Commands + +Be sure to learn the .commands for sqlite. In a session you can see theses with + +``` +.help +``` + +You can also execute these commands from the commandline like so + +``` +sqlite3 your_database.db -cmd ".tables" + +``` + +## PRAGMA Statements + +https://www.sqlite.org/pragma.html + +The PRAGMA statements are sqlite3 specific and used to modify the behavior of the library. + + +To see all the pragmas supported: +``` +sqlite> pragma pragma_list; +``` + +``` +sqlite> pragma page_size; +4096 +``` + +``` +sqlite> pragma page_count; +5 +``` + +Sure enough 4096 * 5 = 20480 bytes as we can see with `ls` + +``` +-rw-r--r-- 1 toor toor 20480 Jul 6 21:28 hellosql.db +``` + +Some of these are connection only, other's are persistent. Example `busy_timeout` will only be non-persistent. + + +An example of an important and persistent setting is the `journal_mode`. This is rollback mode where it will delete the journal when it's done. `WAL` mode is an alternative we will look into. + +``` +sqlite> pragma journal_mode; +delete +``` + +A VERY important one to consider is the `pragma compile_options;` Where you can see how this database was compiled. + +Example, this shows that the brew version was compiled with Full Text Search: + +``` +ENABLE_FTS3 +ENABLE_FTS3_PARENTHESIS +ENABLE_FTS4 +ENABLE_FTS5 +``` + +If you look at the compile options for the MacBook version you will see an OMIT_LOAD_EXTENSIONS flag that means you can't use extensions. + + +Another useful dot command to consider is the .dbconfig but you get less information. +``` +sqlite> .dbconfig + defensive on + dqs_ddl off + dqs_dml off + enable_fkey off + enable_qpsg off + enable_trigger on + enable_view on + fts3_tokenizer off + legacy_alter_table off + legacy_file_format off + load_extension on + no_ckpt_on_close off + reset_database off + reverse_scanorder off + stmt_scanstatus off + trigger_eqp off + trusted_schema off + writable_schema off +``` + + +Another important one is foreign key constraints. Remember they prioritized backwards support over new features, so you need to turn this on. +``` +sqlite> pragma foreign_keys; +0 +``` + +We can simply turn that on with + +``` +pragma foreign_keys=1; +``` + +This is a session level setting. You will need set this at the beginning off every connection. + +Chat Gipity actually can give a good starting point for pragma's to set with reasons for why it would be beneficial. This is a starting point though. You should do your own load testing, and figure out what settings work best for your specific appliction. + +Example "Give me the set of pragma's for sqlite that will benefit performance in service of a web application backend database. + +``` +PRAGMA journal_mode = WAL; +PRAGMA synchronous = NORMAL; +PRAGMA cache_size = -2000; -- 2MB +PRAGMA temp_store = MEMORY; +PRAGMA foreign_keys = ON; +PRAGMA mmap_size = 268435456; -- 256MB +PRAGMA optimize; +PRAGMA busy_timeout = 30000; -- 30 seconds +PRAGMA automatic_index = ON; +PRAGMA locking_mode = EXCLUSIVE; +``` + +Remember to look these up. + +## Loadable extensions + +https://www.sqlite.org/loadext.html +Additional Extensions: https://github.com/nalgeon/sqlean + +You saw above that the Full Text Search extensions were part of the compile options output, thosw are extensions that where compiled into the binary. We can also load extensions if we take on the compilations ourselves Let's follow the instructions to compile and load the csv extension that will let us treat a csv file as a virtual table. + +### Download the extension source. + +A loadable extension obviously can be user code too, but for now we will just stick with the csv extension created by the sqlite team https://www.sqlite.org/src/tree?name=ext. + +``` +git clone https://github.com/sqlite/sqlite && cd sqlite +``` + +Since I am using version 3.46.0 I will checkout that tag and navigate to the source code of the extension. + +``` +git checkout version-3.46.0 +cd ext/misc/ +``` + +Using the example command we see from compilation example + +``` +gcc -g -fPIC -shared YourCode.c -o YourCode.so +``` + +We will compile the csv.so object. + +``` +gcc -g -fPIC -shared csv.c -o csv.so +``` + +Now create a new directory to share this with all users and relocate the object. + +``` +mkdir /usr/local/lib/sqlite3-ext/ +sudo mv csv.so /usr/local/lib/sqlite3-ext/ +``` + +Be sure to add this new directory to your `LD_LIBRARY_PATH` in your `.bashrc` file. + +`export LD_LIBRARY_PATH=/usr/local/lib/sqlite3-ext:$LD_LIBRARY_PATH` + +`source ~/.bashrc` and now you should be able to load this extenstion from the sqlite3 cli. + +``` +sqlite> .load csv +sqlite> +``` + +There are other extensions that add [user authentication to a sqlite database](https://github.com/sqlite/sqlite/blob/master/ext/userauth/user-auth.txt), there is also an extension that can encrypt the file at rest. + +## Virtual Tables + +Allow you to query a table that is not actually in your database. + +Let's take advantage of the csv extension we compiled in the previous step and use that to query a csv file that is actually not part of the database. + + +Just create some data: +``` +cat < ./data/mydata.csv +id,name,level,speed,power +1,Morbinian,20,44,65 +2,Tomoro,15,23,23 +3,Wonku,99,87,93 +EOF +``` + +Now we can create a virtual table using the contents, and query them just like you would expect. + +``` +sqlite> .load csv +sqlite> CREATE VIRTUAL TABLE npcs USING csv(filename="./data/mydata.csv", header=true); +sqlite> .mode box +sqlite> SELECT * FROM npcs; +┌────┬───────────┬───────┬───────┬───────┐ +│ id │ name │ level │ speed │ power │ +├────┼───────────┼───────┼───────┼───────┤ +│ 1 │ Morbinian │ 20 │ 44 │ 65 │ +│ 2 │ Tomoro │ 15 │ 23 │ 23 │ +│ 3 │ Wonku │ 99 │ 87 │ 93 │ +└────┴───────────┴───────┴───────┴───────┘ +``` + +There are other virtual tricks you can use too like + +``` +SELECT * FROM generate_series(1,100); +``` + +Which you could use in a join, or filter statement. + +Look at `json_tree` and `json_each` functions also. Another virtual table backed feature is Full Text Search 5. + +## Floating Point + +There is no Decimal type for SQLite. There is only the REAL data type that represents a floating point number. The real datatype gaurentees 15 digits of precision. + +Let's look at the example of floating point precision. + +``` +CREATE TABLE floats (val REAL); +INSERT INTO floats VALUES (26.3sqlite> CREATE TABLE floats (val REAL); +INSERT INTO floats VALUES (26.3), (-10.52), (-15.78); +sqlite> .mode box +sqlite> SELECT val FROM floats; +┌────────┐ +│ val │ +├────────┤ +│ 26.3 │ +│ -10.52 │ +│ -15.78 │ +└────────┘ + +sqlite> SELECT sum(val) FROM floats; +┌──────────────────────┐ +│ sum(val) │ +├──────────────────────┤ +│ 1.77635683940025e-15 │ +└──────────────────────┘ +``` + +If you do this on paper `26.3 + -10.52 + -15.78 = 0` it should have come out to zero. + +There is a function that could have helped us with this. This decimal_sum comes from a decimal extension. NOTE this extension ships in the cli, but not built into the actual sqlite binary. + +``` +sqlite> SELECT decimal_sum(val) FROM floats; +┌──────────────────┐ +│ decimal_sum(val) │ +├──────────────────┤ +│ 0.00 │ +└──────────────────┘ +``` + +## Optimizing SQLite + +### Locking + +SQLite can only support one writer process at a time, but it can support many readers at any given time. + +Locking States: +- **Unlocked**: No reading / writing +- **Shared Lock**: Process will acquire a shared lock and all other processes can read the DB too, but no process can write +- **Reserved Lock**: Process announces it is going to write. In the reserved lock state new shared locks can be acquired. +- **Pending**: No new readers can start. We are waiting for all the open shared locks to close out. +- ** Exclusive Lock**: Process writes as fast as it can to the DB then gives up the lock allowing for other processes to acquire shared locks again. + +### Rollback mode + +Before the db goes to write something it takes the page being modified, sets aside, then when the new page is successfully written to disk it will disregard the previous page. The act of throughing that data away is what completes the commit process. + +### WAL mode (Write Ahead Log) + +The most important thing for SQLite Performance + +``` +pragma journal_mode; +``` + +To view the current mode. By default it's delete. This IS a persistent setting across sessions. So set once and all subsequent connections will also have this setting. + +In wall mode a new file contains the new writes. At some point a checkpoint will occur, and the WAL log is taken and merged with the db, and the process happens again. Checkpointing is configurable. It's a whole lot faster. We still have a single writer but can have many concurrent readers still. The difference is that the reader will have knowledge of WAL transactions up to a specific point. In this mode writer doesn't block the reader anymore.# High Performance SQLite + +https://highperformancesqlite.com + +``` +brew install sqlite3 +``` + +## Dot Commands + +Be sure to learn the .commands for sqlite. In a session you can see theses with + +``` +.help +``` + +You can also execute these commands from the commandline like so + +``` +sqlite3 your_database.db -cmd ".tables" + +``` + +## PRAGMA Statements + +https://www.sqlite.org/pragma.html + +The PRAGMA statements are sqlite3 specific and used to modify the behavior of the library. + + +To see all the pragmas supported: +``` +sqlite> pragma pragma_list; +``` + +``` +sqlite> pragma page_size; +4096 +``` + +``` +sqlite> pragma page_count; +5 +``` + +Sure enough 4096 * 5 = 20480 bytes as we can see with `ls` + +``` +-rw-r--r-- 1 toor toor 20480 Jul 6 21:28 hellosql.db +``` + +Some of these are connection only, other's are persistent. Example `busy_timeout` will only be non-persistent. + + +An example of an important and persistent setting is the `journal_mode`. This is rollback mode where it will delete the journal when it's done. `WAL` mode is an alternative we will look into. + +``` +sqlite> pragma journal_mode; +delete +``` + +A VERY important one to consider is the `pragma compile_options;` Where you can see how this database was compiled. + +Example, this shows that the brew version was compiled with Full Text Search: + +``` +ENABLE_FTS3 +ENABLE_FTS3_PARENTHESIS +ENABLE_FTS4 +ENABLE_FTS5 +``` + +If you look at the compile options for the MacBook version you will see an OMIT_LOAD_EXTENSIONS flag that means you can't use extensions. + + +Another useful dot command to consider is the .dbconfig but you get less information. +``` +sqlite> .dbconfig + defensive on + dqs_ddl off + dqs_dml off + enable_fkey off + enable_qpsg off + enable_trigger on + enable_view on + fts3_tokenizer off + legacy_alter_table off + legacy_file_format off + load_extension on + no_ckpt_on_close off + reset_database off + reverse_scanorder off + stmt_scanstatus off + trigger_eqp off + trusted_schema off + writable_schema off +``` + + +Another important one is foreign key constraints. Remember they prioritized backwards support over new features, so you need to turn this on. +``` +sqlite> pragma foreign_keys; +0 +``` + +We can simply turn that on with + +``` +pragma foreign_keys=1; +``` + +This is a session level setting. You will need set this at the beginning off every connection. + +Chat Gipity actually can give a good starting point for pragma's to set with reasons for why it would be beneficial. This is a starting point though. You should do your own load testing, and figure out what settings work best for your specific appliction. + +Example "Give me the set of pragma's for sqlite that will benefit performance in service of a web application backend database. + +``` +PRAGMA journal_mode = WAL; +PRAGMA synchronous = NORMAL; +PRAGMA cache_size = -2000; -- 2MB +PRAGMA temp_store = MEMORY; +PRAGMA foreign_keys = ON; +PRAGMA mmap_size = 268435456; -- 256MB +PRAGMA optimize; +PRAGMA busy_timeout = 30000; -- 30 seconds +PRAGMA automatic_index = ON; +PRAGMA locking_mode = EXCLUSIVE; +``` + +Remember to look these up. + +## Loadable extensions + +https://www.sqlite.org/loadext.html +Additional Extensions: https://github.com/nalgeon/sqlean + +You saw above that the Full Text Search extensions were part of the compile options output, thosw are extensions that where compiled into the binary. We can also load extensions if we take on the compilations ourselves Let's follow the instructions to compile and load the csv extension that will let us treat a csv file as a virtual table. + +### Download the extension source. + +A loadable extension obviously can be user code too, but for now we will just stick with the csv extension created by the sqlite team https://www.sqlite.org/src/tree?name=ext. + +``` +git clone https://github.com/sqlite/sqlite && cd sqlite +``` + +Since I am using version 3.46.0 I will checkout that tag and navigate to the source code of the extension. + +``` +git checkout version-3.46.0 +cd ext/misc/ +``` + +Using the example command we see from compilation example + +``` +gcc -g -fPIC -shared YourCode.c -o YourCode.so +``` + +We will compile the csv.so object. + +``` +gcc -g -fPIC -shared csv.c -o csv.so +``` + +Now create a new directory to share this with all users and relocate the object. + +``` +mkdir /usr/local/lib/sqlite3-ext/ +sudo mv csv.so /usr/local/lib/sqlite3-ext/ +``` + +Be sure to add this new directory to your `LD_LIBRARY_PATH` in your `.bashrc` file. + +`export LD_LIBRARY_PATH=/usr/local/lib/sqlite3-ext:$LD_LIBRARY_PATH` + +`source ~/.bashrc` and now you should be able to load this extenstion from the sqlite3 cli. + +``` +sqlite> .load csv +sqlite> +``` + +There are other extensions that add [user authentication to a sqlite database](https://github.com/sqlite/sqlite/blob/master/ext/userauth/user-auth.txt), there is also an extension that can encrypt the file at rest. + +## Virtual Tables + +Allow you to query a table that is not actually in your database. + +Let's take advantage of the csv extension we compiled in the previous step and use that to query a csv file that is actually not part of the database. + + +Just create some data: +``` +cat < ./data/mydata.csv +id,name,level,speed,power +1,Morbinian,20,44,65 +2,Tomoro,15,23,23 +3,Wonku,99,87,93 +EOF +``` + +Now we can create a virtual table using the contents, and query them just like you would expect. + +``` +sqlite> .load csv +sqlite> CREATE VIRTUAL TABLE npcs USING csv(filename="./data/mydata.csv", header=true); +sqlite> .mode box +sqlite> SELECT * FROM npcs; +┌────┬───────────┬───────┬───────┬───────┐ +│ id │ name │ level │ speed │ power │ +├────┼───────────┼───────┼───────┼───────┤ +│ 1 │ Morbinian │ 20 │ 44 │ 65 │ +│ 2 │ Tomoro │ 15 │ 23 │ 23 │ +│ 3 │ Wonku │ 99 │ 87 │ 93 │ +└────┴───────────┴───────┴───────┴───────┘ +``` + +There are other virtual tricks you can use too like + +``` +SELECT * FROM generate_series(1,100); +``` + +Which you could use in a join, or filter statement. + +Look at `json_tree` and `json_each` functions also. Another virtual table backed feature is Full Text Search 5. + +## Floating Point + +There is no Decimal type for SQLite. There is only the REAL data type that represents a floating point number. The real datatype gaurentees 15 digits of precision. + +Let's look at the example of floating point precision. + +``` +CREATE TABLE floats (val REAL); +INSERT INTO floats VALUES (26.3sqlite> CREATE TABLE floats (val REAL); +INSERT INTO floats VALUES (26.3), (-10.52), (-15.78); +sqlite> .mode box +sqlite> SELECT val FROM floats; +┌────────┐ +│ val │ +├────────┤ +│ 26.3 │ +│ -10.52 │ +│ -15.78 │ +└────────┘ + +sqlite> SELECT sum(val) FROM floats; +┌──────────────────────┐ +│ sum(val) │ +├──────────────────────┤ +│ 1.77635683940025e-15 │ +└──────────────────────┘ +``` + +If you do this on paper `26.3 + -10.52 + -15.78 = 0` it should have come out to zero. + +There is a function that could have helped us with this. This decimal_sum comes from a decimal extension. NOTE this extension ships in the cli, but not built into the actual sqlite binary. + +``` +sqlite> SELECT decimal_sum(val) FROM floats; +┌──────────────────┐ +│ decimal_sum(val) │ +├──────────────────┤ +│ 0.00 │ +└──────────────────┘ +``` + +## Optimizing SQLite + +### Locking + +SQLite can only support one writer process at a time, but it can support many readers at any given time. + +Locking States: +- **Unlocked**: No reading / writing +- **Shared Lock**: Process will acquire a shared lock and all other processes can read the DB too, but no process can write +- **Reserved Lock**: Process announces it is going to write. In the reserved lock state new shared locks can be acquired. +- **Pending**: No new readers can start. We are waiting for all the open shared locks to close out. +- ** Exclusive Lock**: Process writes as fast as it can to the DB then gives up the lock allowing for other processes to acquire shared locks again. + +### Rollback mode + +Before the db goes to write something it takes the page being modified, sets aside, then when the new page is successfully written to disk it will disregard the previous page. The act of throughing that data away is what completes the commit process. + +### WAL mode (Write Ahead Log) + +The most important thing for SQLite Performance + +``` +pragma journal_mode; +``` + +To view the current mode. By default it's delete. This IS a persistent setting across sessions. So set once and all subsequent connections will also have this setting. + +In wall mode a new file contains the new writes. At some point a checkpoint will occur, and the WAL log is taken and merged with the db, and the process happens again. Checkpointing is configurable. It's a whole lot faster. We still have a single writer but can have many concurrent readers still. The difference is that the reader will have knowledge of WAL transactions up to a specific point. In this mode writer doesn't block the reader anymore.# High Performance SQLite + +https://highperformancesqlite.com + +``` +brew install sqlite3 +``` + +## Dot Commands + +Be sure to learn the .commands for sqlite. In a session you can see theses with + +``` +.help +``` + +You can also execute these commands from the commandline like so + +``` +sqlite3 your_database.db -cmd ".tables" + +``` + +## PRAGMA Statements + +https://www.sqlite.org/pragma.html + +The PRAGMA statements are sqlite3 specific and used to modify the behavior of the library. + + +To see all the pragmas supported: +``` +sqlite> pragma pragma_list; +``` + +``` +sqlite> pragma page_size; +4096 +``` + +``` +sqlite> pragma page_count; +5 +``` + +Sure enough 4096 * 5 = 20480 bytes as we can see with `ls` + +``` +-rw-r--r-- 1 toor toor 20480 Jul 6 21:28 hellosql.db +``` + +Some of these are connection only, other's are persistent. Example `busy_timeout` will only be non-persistent. + + +An example of an important and persistent setting is the `journal_mode`. This is rollback mode where it will delete the journal when it's done. `WAL` mode is an alternative we will look into. + +``` +sqlite> pragma journal_mode; +delete +``` + +A VERY important one to consider is the `pragma compile_options;` Where you can see how this database was compiled. + +Example, this shows that the brew version was compiled with Full Text Search: + +``` +ENABLE_FTS3 +ENABLE_FTS3_PARENTHESIS +ENABLE_FTS4 +ENABLE_FTS5 +``` + +If you look at the compile options for the MacBook version you will see an OMIT_LOAD_EXTENSIONS flag that means you can't use extensions. + + +Another useful dot command to consider is the .dbconfig but you get less information. +``` +sqlite> .dbconfig + defensive on + dqs_ddl off + dqs_dml off + enable_fkey off + enable_qpsg off + enable_trigger on + enable_view on + fts3_tokenizer off + legacy_alter_table off + legacy_file_format off + load_extension on + no_ckpt_on_close off + reset_database off + reverse_scanorder off + stmt_scanstatus off + trigger_eqp off + trusted_schema off + writable_schema off +``` + + +Another important one is foreign key constraints. Remember they prioritized backwards support over new features, so you need to turn this on. +``` +sqlite> pragma foreign_keys; +0 +``` + +We can simply turn that on with + +``` +pragma foreign_keys=1; +``` + +This is a session level setting. You will need set this at the beginning off every connection. + +Chat Gipity actually can give a good starting point for pragma's to set with reasons for why it would be beneficial. This is a starting point though. You should do your own load testing, and figure out what settings work best for your specific appliction. + +Example "Give me the set of pragma's for sqlite that will benefit performance in service of a web application backend database. + +``` +PRAGMA journal_mode = WAL; +PRAGMA synchronous = NORMAL; +PRAGMA cache_size = -2000; -- 2MB +PRAGMA temp_store = MEMORY; +PRAGMA foreign_keys = ON; +PRAGMA mmap_size = 268435456; -- 256MB +PRAGMA optimize; +PRAGMA busy_timeout = 30000; -- 30 seconds +PRAGMA automatic_index = ON; +PRAGMA locking_mode = EXCLUSIVE; +``` + +Remember to look these up. + +## Loadable extensions + +https://www.sqlite.org/loadext.html +Additional Extensions: https://github.com/nalgeon/sqlean + +You saw above that the Full Text Search extensions were part of the compile options output, thosw are extensions that where compiled into the binary. We can also load extensions if we take on the compilations ourselves Let's follow the instructions to compile and load the csv extension that will let us treat a csv file as a virtual table. + +### Download the extension source. + +A loadable extension obviously can be user code too, but for now we will just stick with the csv extension created by the sqlite team https://www.sqlite.org/src/tree?name=ext. + +``` +git clone https://github.com/sqlite/sqlite && cd sqlite +``` + +Since I am using version 3.46.0 I will checkout that tag and navigate to the source code of the extension. + +``` +git checkout version-3.46.0 +cd ext/misc/ +``` + +Using the example command we see from compilation example + +``` +gcc -g -fPIC -shared YourCode.c -o YourCode.so +``` + +We will compile the csv.so object. + +``` +gcc -g -fPIC -shared csv.c -o csv.so +``` + +Now create a new directory to share this with all users and relocate the object. + +``` +mkdir /usr/local/lib/sqlite3-ext/ +sudo mv csv.so /usr/local/lib/sqlite3-ext/ +``` + +Be sure to add this new directory to your `LD_LIBRARY_PATH` in your `.bashrc` file. + +`export LD_LIBRARY_PATH=/usr/local/lib/sqlite3-ext:$LD_LIBRARY_PATH` + +`source ~/.bashrc` and now you should be able to load this extenstion from the sqlite3 cli. + +``` +sqlite> .load csv +sqlite> +``` + +There are other extensions that add [user authentication to a sqlite database](https://github.com/sqlite/sqlite/blob/master/ext/userauth/user-auth.txt), there is also an extension that can encrypt the file at rest. + +## Virtual Tables + +Allow you to query a table that is not actually in your database. + +Let's take advantage of the csv extension we compiled in the previous step and use that to query a csv file that is actually not part of the database. + + +Just create some data: +``` +cat < ./data/mydata.csv +id,name,level,speed,power +1,Morbinian,20,44,65 +2,Tomoro,15,23,23 +3,Wonku,99,87,93 +EOF +``` + +Now we can create a virtual table using the contents, and query them just like you would expect. + +``` +sqlite> .load csv +sqlite> CREATE VIRTUAL TABLE npcs USING csv(filename="./data/mydata.csv", header=true); +sqlite> .mode box +sqlite> SELECT * FROM npcs; +┌────┬───────────┬───────┬───────┬───────┐ +│ id │ name │ level │ speed │ power │ +├────┼───────────┼───────┼───────┼───────┤ +│ 1 │ Morbinian │ 20 │ 44 │ 65 │ +│ 2 │ Tomoro │ 15 │ 23 │ 23 │ +│ 3 │ Wonku │ 99 │ 87 │ 93 │ +└────┴───────────┴───────┴───────┴───────┘ +``` + +There are other virtual tricks you can use too like + +``` +SELECT * FROM generate_series(1,100); +``` + +Which you could use in a join, or filter statement. + +Look at `json_tree` and `json_each` functions also. Another virtual table backed feature is Full Text Search 5. + +## Floating Point + +There is no Decimal type for SQLite. There is only the REAL data type that represents a floating point number. The real datatype gaurentees 15 digits of precision. + +Let's look at the example of floating point precision. + +``` +CREATE TABLE floats (val REAL); +INSERT INTO floats VALUES (26.3sqlite> CREATE TABLE floats (val REAL); +INSERT INTO floats VALUES (26.3), (-10.52), (-15.78); +sqlite> .mode box +sqlite> SELECT val FROM floats; +┌────────┐ +│ val │ +├────────┤ +│ 26.3 │ +│ -10.52 │ +│ -15.78 │ +└────────┘ + +sqlite> SELECT sum(val) FROM floats; +┌──────────────────────┐ +│ sum(val) │ +├──────────────────────┤ +│ 1.77635683940025e-15 │ +└──────────────────────┘ +``` + +If you do this on paper `26.3 + -10.52 + -15.78 = 0` it should have come out to zero. + +There is a function that could have helped us with this. This decimal_sum comes from a decimal extension. NOTE this extension ships in the cli, but not built into the actual sqlite binary. + +``` +sqlite> SELECT decimal_sum(val) FROM floats; +┌──────────────────┐ +│ decimal_sum(val) │ +├──────────────────┤ +│ 0.00 │ +└──────────────────┘ +``` + +## Optimizing SQLite + +### Locking + +SQLite can only support one writer process at a time, but it can support many readers at any given time. + +Locking States: +- **Unlocked**: No reading / writing +- **Shared Lock**: Process will acquire a shared lock and all other processes can read the DB too, but no process can write +- **Reserved Lock**: Process announces it is going to write. In the reserved lock state new shared locks can be acquired. +- **Pending**: No new readers can start. We are waiting for all the open shared locks to close out. +- ** Exclusive Lock**: Process writes as fast as it can to the DB then gives up the lock allowing for other processes to acquire shared locks again. + +### Rollback mode + +Before the db goes to write something it takes the page being modified, sets aside, then when the new page is successfully written to disk it will disregard the previous page. The act of throughing that data away is what completes the commit process. + +### WAL mode (Write Ahead Log) + +The most important thing for SQLite Performance + +``` +pragma journal_mode; +``` + +To view the current mode. By default it's delete. This IS a persistent setting across sessions. So set once and all subsequent connections will also have this setting. + +In wall mode a new file contains the new writes. At some point a checkpoint will occur, and the WAL log is taken and merged with the db, and the process happens again. Checkpointing is configurable. It's a whole lot faster. We still have a single writer but can have many concurrent readers still. The difference is that the reader will have knowledge of WAL transactions up to a specific point. In this mode writer doesn't block the reader anymore.