Python Issue

the_Grinch Member Posts: 4,165
Hoping someone can help me out because for the life of me I cannot figure out why I'm getting different results.

I want to compare two files that should be exactly the same. In Linux I run the following command:

diff file1 file2

It returns me to the command line with no output, as there is no difference.

Now when I run the same command from Python:

results = os.system("diff caesars_terms.txt caesars_termsa.txt")

It says there is a difference. To really confuse things, if I pull that command out and run it separately from the rest of the script, it works fine.

Script below:

#!/usr/bin/python
from urllib import urlopen
import nltk
import difflib
import os

#Caesar's URL for Terms and Conditions
url = "https://www.caesarscasino.com/en/policies/terms-conditions"
#Opens the URL and reads the HTML
html = urlopen(url).read()
#Cleans the HTML from the file providing only text
raw = nltk.clean_html(html)
#Creates a file and stores the value in said file
f = open("caesars_termsa.txt","w")
f.write(raw)
#Compares the two different files
results = os.system("diff caesars_terms.txt caesars_termsa.txt")
#Displays a passed or failed message as a result of the comparisons of the files
if not int(results):
    print "PASSED"
else:
    print "FAILED"
WIP:
PHP
Kotlin
Intro to Discrete Math
Programming Languages
Work stuff

Comments

  • MrAgent Member Posts: 1,310
    Was this written for a different version of Python than you are using?
    For instance, 2.7 vs 3+.
  • the_Grinch Member Posts: 4,165
    As far as I know, it should all be 2.7. That being said, I have been piecing things together, so that very well could be the issue. I actually found a workaround by breaking things down into modules and running them one at a time. It makes for a lot of moving parts, but I got everything working correctly.
  • YFZblu Member Posts: 1,462
    After f.write, close the file with f.close() on the next line.

    While Python does close the file for you automagically even without the close() method, it's leaving a newline at the bottom of the file, which is the discrepancy between the two files. Calling close() should remove it, and your script should then work.

    Also, have you looked into the Requests module as a replacement for urllib? Check it out, it's a much more elegant API imo.

    Edit: Confirmed success on my laptop - It works
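
A minimal sketch of the fix described above, under some assumptions: it targets Python 3 rather than the OP's 2.7, the file names and the `write_text` and `files_match` helpers are hypothetical, and a literal string stands in for the downloaded terms. The `with` block closes the file (flushing any buffered data to disk) before the comparison runs, and `filecmp.cmp` from the standard library is swapped in for the shell `diff` call. Data still sitting in an unflushed write buffer is one plausible reason an external `diff` would see a different file.

```python
import filecmp
import os
import tempfile


def write_text(path, text):
    """Write text to path; leaving the with-block flushes and closes the file."""
    with open(path, "w") as f:
        f.write(text)


def files_match(path_a, path_b):
    """Return True if both files have identical contents."""
    # shallow=False forces a content comparison instead of trusting os.stat().
    return filecmp.cmp(path_a, path_b, shallow=False)


# Hypothetical stand-in for the cleaned terms text fetched from the site.
terms = "Terms and Conditions\nSection 1\nSection 2\n"

workdir = tempfile.mkdtemp()
expected_path = os.path.join(workdir, "expected.txt")
actual_path = os.path.join(workdir, "actual.txt")

write_text(expected_path, terms)
write_text(actual_path, terms)

result = "PASSED" if files_match(expected_path, actual_path) else "FAILED"
print(result)
```

In the original script, the equivalent change is a single f.close() right after f.write(raw), before os.system runs diff.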
  • YFZblu Member Posts: 1,462
    MrAgent wrote:
    Was this written for a different version of python than you are using?
    For instance 2.7 vs 3+

    Everything there will work in 2.7, which is what the OP is using.
  • the_Grinch Member Posts: 4,165
    YFZblu, you are a genius! It didn't even dawn on me to use close() (which I saw in just about every script I looked at!). I do need to dig into the Requests module a bit more.
  • YFZblu Member Posts: 1,462
    Glad I could help - It's nice to see some Python projects being posted
  • the_Grinch Member Posts: 4,165
    Yeah, Python seemed like the way to go for this project (it's going to be pretty extensive, but will save us a ton of time once completed). You also helped me out in another way, because my original plan was to take a sha1 of the terms and, if that failed, check line by line to see where the difference was. The problem was that it kept failing because the checksum was different. You pointed out the extra line being added, and then I realized that was why it was failing. Thanks again!!!
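
The checksum-then-diff plan mentioned in the last post could look roughly like this. It is only a sketch under assumptions: Python 3, hypothetical helper names (`sha1_of`, `compare_terms`), and made-up sample strings in place of the real terms text. It compares SHA-1 digests first and only falls back to a line-by-line `difflib.unified_diff` when they differ.

```python
import difflib
import hashlib
import sys


def sha1_of(text):
    """Return the hex SHA-1 digest of a text string, encoded as UTF-8."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def compare_terms(expected, actual):
    """Return [] if the digests match, otherwise a list of unified-diff lines."""
    if sha1_of(expected) == sha1_of(actual):
        return []
    # keepends=True keeps line endings, so a stray trailing newline shows up.
    return list(difflib.unified_diff(
        expected.splitlines(keepends=True),
        actual.splitlines(keepends=True),
        fromfile="expected",
        tofile="actual",
    ))


# Hypothetical sample data: the fetched copy has one changed line.
saved = "Section 1\nSection 2\n"
fetched = "Section 1\nSection 2 (revised)\n"

same = compare_terms(saved, saved)
changed = compare_terms(saved, fetched)

print("PASSED" if not same else "FAILED")
sys.stdout.writelines(changed)
```

Keeping the line endings in the diff matters for a case like the one in this thread, where the only difference may be an extra newline at the end of one file.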