2 Ways to add Logging to your Python Code
2023-07-30
➡ JOIN MY MAILING LIST https://johnwr.com ➡ COMMUNITY https://discord.gg/C4J2uckpbR ➡ PROXIES https://proxyscrape.com/?ref=jhnwr ➡ WEB SCRAPING API https://hubs.li/Q043T88w0 ➡ HOSTING https://m.do.co/c/c7c90f161ff6 If you are new, welcome. I'm John, a self taught Python developer and content creator, working at Zyte. I specialize in data extraction and automation. If you like programming and web content as much as I do, you can subscribe for weekly content. All views in this video are my o...
Subtitles

when you start to run your code more often or on a server you want to have

logging in there somewhere so you can see what's happening and what's going on

with it when you're not watching it in this video I'm going to show you two

different ways you can do that with this web scraping example here the first one

is using print and now I've talked about this before and I've advised against it

but it does have its place and the second one is using logging now they are

both going to be using the event hooks within our HTTP client because that is

what we want to actually log the requests that that scraper is making to

the server so I'm using httpx but this absolutely works with requests as well

they have exactly the same thing and it works in a very similar way so if we look at the

event hooks page here on the httpx documentation it tells us that we can

add in these hooks for a request and response and we can then decide what we

do and it has some information within the request you actually get access to

the URL Etc and the response has everything apart from the body I think

which you would need to call read but that's fine because we're not going to

be logging the body we literally just want to see when the requests were being

made and if they were successful so it's nice and easy to do this in fact

I'm actually just going to copy their example and we'll put it into our code

so let's come down to our client down here and let's put this in we need to

indent this like this and add this here so we now have our client set

up here with the event hooks added in let's just give ourselves a little bit

of extra space there so we can see what's happening so every time we make a

request our code is going to print out this information here which is basically

going to say what method was being used and the URL and when we get it back we

get this response here which says the method the URL and the status code this

is exactly the information that we want you can see how easy it is to add this

into your program so I'm going to save this and we're going to run this code

now I'm just going to double check that I haven't got too much print out coming

here I'm going to remove some of these print statements because we are not

going to be needing the actual data that's coming back let's clear this and

let's run this code python 4 doesn't exist main.py here we go so we can see

all of the responses I'm going to close that and we'll go back up we can

see that we're getting the pages so we have this request event hook get that

says hey we're making a request of this page and here status 200. this is the

information that we want but there is one piece of information that is missing

and that's the date so let's make sure that is in there because otherwise we're

going to have no idea when any of this happened
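
A minimal sketch of the print-based hooks described so far, modeled on the example in the httpx event hooks documentation. The helper name make_client is hypothetical, and the exact message wording is an assumption rather than the code shown in the video.

```python
def log_request(request):
    # Called before each request is sent: print the method and URL.
    print(f"Request event hook: {request.method} {request.url} - Waiting for response")


def log_response(response):
    # Called when a response arrives: print the method, URL, and status code.
    request = response.request
    print(f"Response event hook: {request.method} {request.url} - Status {response.status_code}")


def make_client():
    # Hypothetical helper: build an httpx client with both hooks attached.
    # httpx is imported here so the hooks can be tried without it installed.
    import httpx

    return httpx.Client(
        event_hooks={"request": [log_request], "response": [log_response]}
    )
```

Every request made through the returned client would print one line on the way out and one when the response comes back.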

so we need to have datetime in so I'm going to do from datetime we're going

to import datetime like this this will allow us to create a datetime object in

Python so let's come back down to our main here and I'm going to add in time

is equal to datetime.now I think it is this will allow us here so what

we're going to do is we're just going to add this in to start with we'll have

time like this and something like that this should work fine
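
The timestamp change could look something like the sketch below. It assumes the same two hook functions as the httpx documentation example, and the message layout is an assumption.

```python
from datetime import datetime


def log_request(request):
    # Capture the current local time so each line records when it happened.
    time = datetime.now()
    print(f"{time} Request: {request.method} {request.url}")


def log_response(response):
    # Same idea for the response, with the status code added.
    time = datetime.now()
    request = response.request
    print(f"{time} Response: {request.method} {request.url} - Status {response.status_code}")
```

The hooks are registered on the client exactly as before; only the printed text changes.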

time okay save and close let's run it again so now we have a time and date

object when this is all happening so this is all well and good you're gonna

say but this is just printing to the screen what do we actually do with this

I always run my code on a Linux machine whether it's on a DigitalOcean droplet

which is an Ubuntu server or Debian server depending what you chose so what

we can do is we can just output using the double arrows (>>) to our

out.log file like this what this is going to do is this is going to

output all of the standard out into a log file for us which we'll then be able

to investigate if we need to so I'm going

to run this I'm going to wait for a couple of seconds I'm going to close it

and then I'm going to cat my log file and here is all the information let's

open that in neovim and here we have everything there so you can see the time

the response etc etc so when you're running this on your

server in your cron job instead of just having the run with your main.py you'd

include the double arrows and the out.log here too

so this is good I like this this is a very simple way of getting it and it's

very easy to add in so it doesn't take up much time or effort now the other

thing we can do is actually use the Python logging module which is going to

give us access to warnings and information so it'll actually have

different levels and this can be highly configurable now I have not dived into

this nearly as far as I suspect some people have however for what I do I

generally don't tend to use it too much anymore I just need to see some basic

information but it is really powerful and you can have all sorts of custom

bits in so we're going to have a really simple logger here if we look down here

you'll see that we have format and basicConfig before we go any further though I

want to talk about actually the file so you can of course have python save the

file for you and do it all that way instead of doing it via the output like

I did in the other example this is all well and good and you

absolutely can do this I just find that maybe having an extra thing that your

code needs to do is not that useful we can just output it using standard out

and then we have no errors because our operating system will always have that

standard out that we can use and when we print we can choose what exactly goes

out there maybe we want to add a couple more things in or maybe you want to

include a lot more information you can print that all out let's go ahead and

remove our print logging from here and add in the actual logging module so

we'll do import logging so let's set up our basic config so

we'll do logging.basicConfig and I think we want format it's going to be

equal to over here so let's do the time paste that in there

and let's have the um

levelname and then the message which I think is

just message there we go like this so this is going to give us

access to this logging information and now instead of our print statements down

here in our log_request we can do logging.info

let's have this as an info request and this should give us the information in

but we want to add in the request.url as well so let's just put in

our request.url

and let's comment out this for the moment logging.info

and again we'll have um request status

response.status_code
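
Put together, the logging version of the hooks might look like the sketch below. The format string uses the standard asctime, levelname, and message attributes. The level argument is included because the root logger defaults to WARNING, which would hide info messages. The message wording is an assumption.

```python
import logging

# Configure the root logger once: timestamp, level name, then the message.
logging.basicConfig(
    format="%(asctime)s %(levelname)s %(message)s",
    level=logging.INFO,
)


def log_request(request):
    # Log the outgoing request at INFO level.
    logging.info("Request: %s %s", request.method, request.url)


def log_response(response):
    # Log the status code of the response at INFO level.
    request = response.request
    logging.info(
        "Response: %s %s - Status %s",
        request.method,
        request.url,
        response.status_code,
    )
```

Because asctime is part of the format, there is no need to build the timestamp by hand as in the print version.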

we need to add in the level here so I'm going to put this as

logging.INFO and this should give us that out to the screen now format this

should be a capital once we added in that uh the level so we

actually log this thing you can see that we're basically getting the same

information which is what we asked for the URL and then the response with the

response status code here it's

essentially the same information if you wanted to do it this way this

way is obviously much better if you're planning on building out or expanding on

this but for very small code projects like this one you're just running every

now and again it's probably fine to just do the print statements so let's just

add in a file here as well which I don't tend to do as I explained just a minute

ago we'll say um

scraper.log like so add our format back in
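
Having Python write the file itself can be sketched by passing filename to basicConfig. The helper name configure_file_logging is hypothetical; scraper.log is the file name used in the video.

```python
import logging


def configure_file_logging(path="scraper.log"):
    # basicConfig does nothing if the root logger already has handlers,
    # so this should run once, before any other logging setup.
    logging.basicConfig(
        filename=path,  # records go to this file instead of the screen
        format="%(asctime)s %(levelname)s %(message)s",
        level=logging.INFO,
    )
```

By default the file is opened in append mode, so repeated runs keep adding to the same log, much like the double arrows do.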

save this and now when we run this code it should all be logged to our scraper.log

file okay stop that and let's just do nvim

scraper.log and here is all the information that we logged just now so

what do you think is it worth using the actual built-in logging like I'm doing

here do you log to file do you print the log do you then use the Linux system the

server to save it into a file for you let me know what you do I do both print

logging is sometimes just quicker and easier but obviously the actual logging

module is much more powerful if you want to see how I wrote this scraper code so

you can adapt it and amend it to websites that you want to grab the data

from you're going to want to watch this video right here next