When you start to run your code more often, or on a server, you want to have
logging in there somewhere so you can see what's going on with it when you're
not watching it. In this video I'm going to show you two different ways you can
do that with this web scraping example. The first one is using print. I've
talked about this before and I've advised against it, but it does have its
place. The second one is using the logging module. They are both going to use
the event hooks within our HTTP client, because what we actually want to log is
the requests that the scraper is making to the server. I'm using httpx, but
this absolutely works with requests as well.
It has the same kind of hooks and they work in a very similar way. If we look
at the event hooks page in the httpx documentation, it tells us that we can add
hooks for the request and the response, and we then decide what to do in them.
The request hook gets the request, so you have access to the URL and so on, and
the response hook gets everything apart from the body, I think, for which you
would need to call read. That's fine, because we're not going to be logging the
body; we just want to see when the requests were made and whether they were
successful. It's nice and easy to do. In fact, I'm just going to copy their
example and put it into our code.
So let's come down to our client and put this in. We need to indent it like
this and add it in here, so we now have our client set up with the event hooks
added in. Let's just give ourselves a little bit of extra space so we can see
what's happening. Every time we make a request, our code is going to print out
this information, which basically says what method was used and the URL, and
when we get the response back we print the method, the URL and the status code.
This is exactly the information that we want, and you can see how easy it is to
add into your program. I'm going to save this and we're going to run the code.
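As a rough sketch, the client setup at this point looks something like the following. The two hook functions follow the example in the httpx event hooks documentation; `make_client` is a hypothetical helper name used here for illustration, and the URL in the usage comment is a placeholder.

```python
def log_request(request):
    # Called by httpx just before each request is sent.
    print(f"Request event hook: {request.method} {request.url} - Waiting for response")


def log_response(response):
    # Called when a response arrives; the body has not been read at this point.
    request = response.request
    print(f"Response event hook: {request.method} {request.url} - Status {response.status_code}")


def make_client():
    # Hypothetical helper: builds a client with both hooks attached.
    # httpx is imported here so the hooks above can be tried without it installed.
    import httpx  # third-party: pip install httpx

    return httpx.Client(
        event_hooks={"request": [log_request], "response": [log_response]}
    )


# Usage (makes a real network request; the URL is a placeholder):
#     with make_client() as client:
#         client.get("https://example.com")
```

Each hook list can hold more than one function, so further hooks can be added later without touching these two.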
Now I'm just going to double check that I haven't got too much printing out.
I'm going to remove some of these print statements, because we are not going to
need the actual data that's coming back. Let's clear this and run the code with
python main.py. Here we go, we can see all of the responses. I'm going to close
that and go back up. We can see that we're getting the pages: we have this
request event hook, GET, that says we're making a request for this page, and
here, status 200. This is the information that we want, but there is one piece
of information that is missing and that's the date. Let's make sure that is in
there, because otherwise we're going to have no idea when any of this happened.
So we need datetime. I'm going to do from datetime import datetime, like this,
which will allow us to create a datetime object in Python. Let's come back down
to our main and I'm going to add in time equals datetime.now(). We're just
going to add this in to start with, so we'll have time like this, and something
like that should work fine.
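A minimal sketch of the print hooks with the timestamp added. It assumes the time is taken inside each hook with `datetime.now()` and placed at the front of the line; the exact position in the original code is not shown, so treat the layout as an assumption.

```python
from datetime import datetime


def log_request(request):
    # Take the timestamp at the moment the hook fires.
    time = datetime.now()
    print(f"{time} Request event hook: {request.method} {request.url} - Waiting for response")


def log_response(response):
    time = datetime.now()
    request = response.request
    print(f"{time} Response event hook: {request.method} {request.url} - Status {response.status_code}")
```

Taking the time inside the hook, rather than once at startup, means every line carries the moment that particular request or response happened.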
Okay, save and close, and let's run it again. Now we have the date and time
for when this is all happening. This is all well and good, you're going to
say, but this is just printing to the screen. What do we actually do with this?
I always run my code on a Linux machine, such as a DigitalOcean droplet, which
is an Ubuntu server or a Debian server depending on what you chose. What we can
do is redirect the output using the double arrows (>>) to our out.log file,
like this: python main.py >> out.log. This is going to append all of the
standard out to a log file for us, which we'll then be able to investigate if
we need to. I'm going to run this, wait a couple of seconds, close it, and then
cat my log file, and here is all the information. Let's open that in Neovim,
and here we have everything: you can see the time, the response, etc. So when
you're running this on your server in your cron job, instead of just running
your main.py, you'd include the double arrows and the out.log there too.
So this is good. I like this: it's a very simple way of getting logs and it's
very easy to add in, so it doesn't take up much time or effort.

Now the other thing we can do is use the Python logging module, which is going
to give us access to warnings, info messages and so on. It has different levels
and it is highly configurable. I have not dived into it nearly as far as I
suspect some people have. For what I do, I generally don't tend to use it too
much anymore, because I just need to see some basic information, but it is
really powerful and you can have all sorts of custom bits in there. We're going
to have a really simple logger here. If we look down here you'll see that we
have format and basicConfig. Before we go any further, though, I
want to talk about the file. You can of course have Python save the log file
for you and do it all that way, instead of doing it via the output redirect
like I did in the other example. That is all well and good and you absolutely
can do it. I just find that having an extra thing that your code needs to do is
not that useful. We can just output using standard out, and then there is one
less thing that can fail, because our operating system will always have that
standard out that we can use. And when we print we can choose exactly what goes
out there: maybe we want to add a couple more things in, or maybe you want to
include a lot more information; you can print all of that out. Let's go ahead
and remove our print logging from here and add in the actual logging module, so
we'll do import logging. Let's set up our basic config, so we'll do
logging.basicConfig, and I think we want format, which is going to be equal to
a format string. Let's put the time, which is asctime, in there, then let's
have the levelname, and then the message, which I think is just message. There
we go, like this. This is going to give us access to this logging information,
and now instead of our print statements down here in log_request and
log_response we can do logging.info. Let's have this one as an info message
saying request, and this should give us the information in. We want to add in
the request.url as well, so let's just put in our request.url. Let's comment
out this print for the moment and do logging.info again, and we'll have the
response status, response.status_code.
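Put together, the logging version of the hooks might look like the sketch below. The format string uses the standard `asctime`, `levelname` and `message` fields; the exact message wording ("Request:", "Response:") is an assumption for illustration. The level is set to INFO because the root logger only shows warnings and above by default, so without it the `logging.info` calls would print nothing.

```python
import logging

# Timestamp, level name and message on every line; INFO and above are shown.
logging.basicConfig(
    format="%(asctime)s %(levelname)s %(message)s",
    level=logging.INFO,
)


def log_request(request):
    # Replaces the print call in the request hook.
    logging.info("Request: %s %s", request.method, request.url)


def log_response(response):
    # Replaces the print call in the response hook.
    request = response.request
    logging.info(
        "Response: %s %s - Status %s",
        request.method,
        request.url,
        response.status_code,
    )
```

With no filename given, `basicConfig` writes to standard error, so the lines still show up on the screen.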
We need to add in the level here, because by default only warnings and above
are shown, so I'm going to put this as level equals logging.INFO, and this
should give us that out to the screen. INFO should be in capitals. Once we've
added in the level and we actually run this, you can see that we're basically
getting the same information, which is what we asked for: the URL, and then the
response with the response status code. It is essentially the same information
if you wanted to do it this way. This way is obviously much better if you're
planning on building out or expanding on this, but for very small projects like
this one that you're just running every now and again, it's probably fine to
just use the print statements. Let's just add in a file here as well, which I
don't tend to do, as I explained just a minute ago. We'll say filename equals
scraper.log, like so, and add our format back in.
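A minimal sketch of the file version. `setup_logging` is a hypothetical wrapper name, and `force=True` (Python 3.8+) is an addition not mentioned in the video: it is there so the call still takes effect if logging has already been configured earlier in the same process.

```python
import logging


def setup_logging(log_file="scraper.log"):
    # Hypothetical helper: send log lines to a file instead of the screen.
    logging.basicConfig(
        filename=log_file,  # opened in append mode by default
        format="%(asctime)s %(levelname)s %(message)s",
        level=logging.INFO,
        force=True,  # assumption: replace any handlers set up earlier
    )


# Usage:
#     setup_logging()
#     logging.info("Request: GET https://example.com")  # placeholder message
```

Because the file is opened in append mode, repeated runs add to the same scraper.log, much like the double arrows do for redirected print output.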
Save this, and now when we run the code it should all be logged to our
scraper.log file. Okay, stop that, and let's just do nvim scraper.log, and here
is all the information that we logged just now.

So what do you think? Is it worth using the actual built-in logging like I'm
doing here? Do you log to a file? Do you print the log and then use the Linux
system on the server to save it into a file for you? Let me know what you do. I
do both: print logging is sometimes just quicker and easier, but obviously the
actual logging module is much more powerful. If you want to see how I wrote
this scraper code, so you can adapt it to websites that you want to grab data
from, you're going to want to watch this video right here next.