are you storing your data in a database yet do you have information that you
want to make persistent maybe for an app or a web scraper you're not sure where
to start well i'll tell you use sqlite in python we have access to sqlite3
which makes this the easiest and most simplest way to make your data
persistent in a database in this video i'm going to show you how you can create
connect to a database create a table and then store some data in it and at the
end i'll show you a real world example where i have a web scraper hooked up
that only stores the data in the database if it doesn't exist already the
first thing that we need to do is we need to import sqlite3 and this is in
call python so we don't need to pip install anything it's already there for
us the next thing that we need to do is we need to set up a connection to our
database so i'm going to do con connection it's equal to sqlite3 dot
connect and then the name of the database that we want to connect to i'm
just going to call this one example.db now what this line will do is if there
is no database that exists with this name in this
folder it's going to create it if it does exist it's going to connect to it
so the next thing we want to do is we need to set up a cursor so what this
will do is this will let us actually execute commands on the database so now
we have both of these we can start to create our table in our database to do
that we're going to use the cursor and we're going to use execute now this lets
us execute sql queries onto our database throughout python code now to do that we
need to use the triple quote marks and we need to start writing our sql query
and i'm going to do create table if not exists
all in capitals and this does exactly what it looks like it should do it's
going to create this table if it doesn't exist already so let's do t-shirts
because we're going to call our table t-shirts and now we need to define what
columns we want to have in our table
to do that we need to create some brackets and then we need to write our
column headers so i'm going to say skew it's text
so the first one is the column header the name of the column and then the
second is the actual data type you can use quite a few different data types in
this one we're going to be using text and numbers which are real the next is
going to be name which again is text and then let's just put a size for text
as well and then the price which is real which is again as i said floating point
number i'm going to run this and what we're going to do is we'll see no output
but we are going to have created this example.db file
in our project folder here i don't think i can make this any bigger there you go
you see it over there now what we're going to do is we want to actually add
some data we're going to use our cursor.execute again
and inside here we're going to write our next sql query which is going to be
insert into t-shirts which is our table name and
then the word values and now we're going to actually type out the values that are
going to go into this database now this needs to match this exactly because
these are the columns that we have created in this table so the first bit
of data that we're going to do is the sku i'm just going to make that up
and then i think it's the name size and the price that's going to
execute to insert this data into this table here so if we run this you might
expect this to actually put that data into the database but it won't because
we are missing one one extra line of code that we need and
we need to commit these changes so we do con
dot commit what this means is you can execute or
execute many and then do con dot commit at the end to commit all your changes in
one go so now that we'll run this and we'll commit these changes what we're
going to see is that that data has actually been put into the database now
there's two ways we can check this out we can write some python code to select
data from this database and then display it which we're going to do and we also
could use a database program i'll show you that as well in just a minute so
what i'm going to do is i'm going to do four row
in cur dot execute and then our sql query select star
from t-shirts this is basically saying select
everything from this table you generally wouldn't use this sort of a query with
the star you'd be more specific as to the data that you wanted however in this
example we are going to be fetching everything just to keep it simple
and then we can print row so far we've connected to the database we've created
our cursor we've executed a table if not exist so this means that when it finds
this table it'll just move on and we're inserting this data committing it and
then pulling it back out again so we can see that we have two instances
which is the same bits of data i'm going to run it again we've got three four
five etc so on now this is not ideal this is not
exactly what we want because all we're doing is just inserting this data again
and again because we haven't used a primary key now when we set one of these
fields up as a primary key you can only have one instance of that value in that
database so what i'm going to do is i'm going to delete our database and what
i'm going to do is i'm going to change the sku to the primary
key so now when i run this we get one lot of data back here
that's great and if i run it again we get an error because unique constraint
failed so basically it's saying if you can't add this information again because
your primary key already exists now we could change this let's say this is a
different product let's do medium large and run that there we go we have
both instances in that data but what we want to do is we want to be able to
try to insert this data and ignore it if it already exists because of the primary
key now you can have your primary key as any bit of your information if you like
but the easiest way to do that is to do insert
or ignore
so what this is all this is going to do is that if it can't insert this because
the primary key already exists it's going to ignore it and move on there we
go so we can run this over and over again no data is being added
change the primary key there we go extra data has been added so here we have a
real simple basic web scraper that takes the
scraping sandbox i'm doing some data cleaning here this is just returning the
soup data and we're passing out this information here now if you've followed
my videos before you'll know that i like to return a dictionary but in this case
i'm returning a tuple because it's much easier that way to put the data into our
database now you can execute and add dictionary information to a database but
if you think about it the database already has column headers whereas in
the dictionary you would have keys now those neces are two types of the same
sort of thing if you see what i mean so we don't need to do them so all i'm
doing is collecting the data from here and then we are returning our list
of tuples so we can see that we have our book list here so what i'm going to do
is i'm just going to comment these out for the moment and i'm going to print
the book list
at here so and i'll comment out the uh i know we'll leave that in now so when
we run this you'll see that we get all the book data back here uh it's a list
of two pools so we want to add this to our database
so all i'm doing is adding execute many so i showed you
execute before now we're doing execute many and to add a list of tuples to our
database again i'm using insert or ignore books is our table the values and
we have three bits of information we have the title price and the stock so we
have these three question marks here and then our book list what this is going to
do is it's going to run through this and it's going to add each
tuple from this book list into this table so i hit run
we've got our print statement here and it's finished now because i have insert
or ignore and then we are basically just scraping one page here if i run this
again we get nothing
we will see the output again the print statement but we don't get any errors
so what i'm going to do now is i'm going to show you the
database browser and we're going to open this up
and we'll have a look at this actual database here so i'm using db browser
for sqlite this works in linux windows and mac as well i believe and we can see
i've opened up the database and we have this table here so if i go to browse
data here is the information these are the 20 products on that page that we
have got the data for so no matter how many times we run this
we're not going to add any more information it's only going to be these
bits that stay in here because i set at the top the title text to primary key so
that means the name of the book the text is the primary key that cannot be added
again and because we are again doing insert or ignore you don't get that in
data added any more if you found this useful go ahead and check out this video
here is more information on web scraping and saving data