HBH Python Competition
Hey guys, I've noticed there's been a lot of inactivity here on HBH lately, so I got Mr_Cheese to approve a new code optimization competition.
The Rules:
- The script can be edited as much as you want between the "Begin Optimization" and "End Optimization" comments.
- The script must take input from a text file. Don't assume I'll use the same data when I test your code.
- If two identical scripts are turned in, the first delivered will be the one participating.
- Everyone can participate (even administrators). I won't be participating since I wrote the script.
- The winner will be the person with the least execution time on the script.
- The scripts will be tested multiple times and I will take the average time.
Prizes: 1st - 100 challenge points. 2nd - 75 challenge points. 3rd - 50 challenge points. All winners will also have a cool little recognition stated on their profiles.
All submissions must be in by November 6th.
If anything here is unclear, feel free to ask me via pm, or here in this thread. This thread is for questions concerning this topic only.
The script reads a long string of numbers from a file, and then counts the number of ones, twos, threes, etc. It then puts those into a list and determines if there are any duplicates (e.g. if there are 10 ones and 10 twos). Then the program prints the results.
Here is the code:
import time
t=time.time()
################################
#########BEGIN OPTIMIZATION########
################################
file=open('input.txt', 'r')
str=file.read()
zeroes=ones=twos=threes=fours=fives=sixes=sevens=eights=nines=0
repeats=False
#Determines how many of each number there are in the string
for x in range(0, len(str)):
    if(str[x]=='0'):
        zeroes+=1
    if(str[x]=='1'):
        ones+=1
    if(str[x]=='2'):
        twos+=1
    if(str[x]=='3'):
        threes+=1
    if(str[x]=='4'):
        fours+=1
    if(str[x]=='5'):
        fives+=1
    if(str[x]=='6'):
        sixes+=1
    if(str[x]=='7'):
        sevens+=1
    if(str[x]=='8'):
        eights+=1
    if(str[x]=='9'):
        nines+=1
counts=[zeroes, ones, twos, threes, fours, fives, sixes, sevens, eights, nines]
#Determines if there are any duplicates in the array of counts
for x in range(0, len(counts)):
    for y in range(0, len(counts)):
        if(x==y): continue
        if(counts[x]==counts[y]): repeats=True
#Print results
for x in range(0, len(counts)):
    print "There are "+repr(counts[x])+" "+repr(x)+"'s"
if(repeats==True):
    print "There was at least one repeat"
else:
    print "There were no repeats"
################################
########END OPTIMIZATION########
################################
#calculate and show time usage
t2=time.time()
print "Time elapsed: "+repr(1000*(t2-t)) #microseconds
And here is the data (input.txt):
The output should look like:
There are 1978 1's
There are 1891 2's
There are 1621 3's
There are 1864 4's
There are 2020 5's
There are 1503 6's
There are 1837 7's
There are 1549 8's
There are 1078 9's
There were no repeats
**Current Rankings:**
They're getting pretty close. I'm going to post a few extra people in the rankings:
Me - 4.7528489854350002
COM - 4.7472755556899999
stdio - 4.7471755556899999
just a panda - 4.7315940047550002
otani - 4.58821655726
Defience - 4.3426556233060003
Original Script - 1
The rankings are measured in "x times speedup".
Note, I'm not in the competition, I just put mine into the ranking for comparison.
stealth- wrote: I'm assuming we only get one submission? Say, for instance, that I hand mine in because I didn't want someone to hand theirs in before me and have the same code, but then later I find out that because I was rushing I missed a huge optimisation. Could I hand in a replacement? How about I use whichever version gives the best time?
And I'm running Python 2.5. That should be fine with any 2.x since there shouldn't be any need for extra libraries. Python 3 sucks.
I assume by "repeats" you mean two identical numbers occur after each other in the string?
ynori7 wrote: How about I use whichever version gives the best time?
And I'm running Python 2.5. That should be fine with any 2.x since there shouldn't be any need for extra libraries. Python 3 sucks. My first tests on the unoptimized code show me that Python 3 runs the script in half the time Python 2.6 does.
ctrl_ wrote: I assume by "repeats" you mean two identical numbers occur after each other in the string? Say you have the list (1, 2, 3, 4, 3). In that list, 3 occurs twice. That is a repeat. I suppose "duplicates" would have been a better word.
My first tests on the unoptimized code show me that Python 3 runs the script in half the time Python 2.6 does. It doesn't matter, they'll all be tested on the same machine with the same version of Python.
ynori7 wrote: [quote]ctrl_ wrote: I assume by "repeats" you mean two identical numbers occur after each other in the string? Say you have the list (1, 2, 3, 4, 3). In that list, 3 occurs twice. That is a repeat. I suppose "duplicates" would have been a better word. [/quote] This is where I'm getting confused. By this definition, wouldn't counting the number of occurrences be the same thing as counting how many "duplicates/repeats" there are?
No no. Alright, say the string you read from the file is this: "1234512". Now you total up the number of occurrences of each number: 2 ones, 2 twos, 1 three, 1 four, 1 five. Those counts are stored in a list like this: [2, 2, 1, 1, 1]. You now have duplicates in that list: there are two 2's and three 1's. That is what the second part of the algorithm checks.
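To put that in code, here is a tiny sketch (Python 2, like everything else in this thread) of the two stages; the variable names are just for illustration and this is not anybody's entry:
# Stage 1: tally how many times each digit 0-9 occurs in the string
s = "1234512"
counts = [0] * 10
for ch in s:
    counts[int(ch)] += 1
# counts is now [0, 2, 2, 1, 1, 1, 0, 0, 0, 0]
# Stage 2: the counts list has duplicates if turning it into a set shrinks it
repeats = len(set(counts)) != len(counts)
print repeats  # prints True: the value 2 (and 1, and 0) appears more than once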
Hey I had a suggestion, though it does mean slightly more work for you Ynori7.
Looking at the past PHP competition for example, everyone submitted their code and at the end winners were announced. This doesn't seem too competitive, or at least I would like to see it be more so.
Would it be possible to update the times of the top 3 on a weekly basis? You gave us 1 month to rewrite code that takes about ~20min. Knowing the current standings and times to beat would be more incentive for both myself and others who submit optimizations to rethink the way they did things, and in the end we will end up with better code.
stdio wrote: Would it be possible to update the times of the top 3 on a weekly basis? You gave us 1 month to rewrite code that takes about ~20min. Knowing the current standings and times to beat would be more incentive for both myself and others who submit optimizations to rethink the way they did things, and in the end we will end up with better code. Well, posting the times won't be all that helpful to people since times on my computer will be different than times on yours, but I could update the post every so often to say who the three current leaders are.
EDIT: Done. The current rankings are at the bottom of the original post.
stdio wrote: Hey I had a suggestion, though it does mean slightly more work for you Ynori7.
Looking at the past PHP competition for example, everyone submitted their code and at the end winners were announced. This doesn't seem too competitive, or at least I would like to see it be more so.
Would it be possible to update the times of the top 3 on a weekly basis? You gave us 1 month to rewrite code that takes about ~20min. Knowing the current standings and times to beat would be more incentive for both myself and others who submit optimizations to rethink the way they did things, and in the end we will end up with better code.
That was a good idea, now I can see I definitely need to work on mine. :)
I'm going to try my best to explain why this challenge is skewed. Bear with me, and understand that all times reported are from my computer, though I will relate them back to ynori7's. First let's look at the original code:
import time
t=time.time()
################################
#######BEGIN OPTIMIZATION######
################################
file=open('input.txt', 'r')
str=file.read()
zeroes=ones=twos=threes=fours=fives=sixes=sevens=eights=nines=0
repeats=False
#Determines how many of each number there are in the string
for x in range(0, len(str)):
    if(str[x]=='0'):
        zeroes+=1
    if(str[x]=='1'):
        ones+=1
    if(str[x]=='2'):
        twos+=1
    if(str[x]=='3'):
        threes+=1
    if(str[x]=='4'):
        fours+=1
    if(str[x]=='5'):
        fives+=1
    if(str[x]=='6'):
        sixes+=1
    if(str[x]=='7'):
        sevens+=1
    if(str[x]=='8'):
        eights+=1
    if(str[x]=='9'):
        nines+=1
counts=[zeroes, ones, twos, threes, fours, fives, sixes, sevens, eights, nines]
#Determines if there are any duplicates in the array of counts
for x in range(0, len(counts)):
    for y in range(0, len(counts)):
        if(x==y): continue
        if(counts[x]==counts[y]): repeats=True
#Print results
for x in range(0, len(counts)):
    print "There are "+repr(counts[x])+" "+repr(x)+"'s"
if(repeats==True):
    print "There was at least one repeat"
else:
    print "There were no repeats"
################################
########END OPTIMIZATION########
################################
#calculate and show time usage
t2=time.time()
print "Time elapsed: "+repr(1000*(t2-t)) #microseconds
I ran this 10 times with an average t of about ~35us. I then commented out the print lines and got the algorithm to run at an average t of 9.9us. This tells me that the algorithm can be improved.
So I redid the program (and yes, I'm giving out my answer to support my tests).
Edited for the sake of the competition.
Again I ran it 10 times, with an average t of 20us. I then chose to comment out the print lines and redo it. Now the average time was too small to detect, giving me back a t of 0.0, so I ran the algorithm without the print in a loop 1000 times. This time I was getting an average time of t=399us, which suggests that the algorithm part, the part that we are allowed to modify, is running at about 0.4us.
So in total I was at 20us on my machine, and 19.6us of that is solely in the print function, which is MANDATORY TO HAVE. That means 98% of the time is out of my control.
Now let's look at the results: I ran at 42us and COM ran his at 20us. This suggests that either the same print output is not being displayed, or, more likely, there is noise filling the data. By noise I mean things beyond our control.
To get a 50% reduction, the only thing I could change is the print function, but I'm not allowed to touch it. So really it's just a matter of when the computer was restarted, quirks of Python's garbage collection, lucky timing with the CPU, and a host of other shit that's out of our control.
Just to test this again I ran the following.
import time
t=time.time()
################################
#######BEGIN OPTIMIZATION######
################################
for i in range(0,11):
    print i
################################
########END OPTIMIZATION########
################################
#calculate and show time usage
t2=time.time()
print "Time elapsed: "+repr(1000*(t2-t)) #microseconds
This code is just printing 11 lines. That's it, and it runs at 20us on my machine; my code above also runs at 20us. They are indistinguishable.
In conclusion, I have rewritten the algorithm from ~9us down to less than 1us, and the print function is fucking everything up. Unless you force the algorithm part (the part we can edit) to loop enough times that the print timing becomes statistically insignificant, this challenge is bullshit.
Note: I'm not saying my program is the best, I'm saying the print function is fucking everything up to the point where I can't decipher how to improve it.
Also: this isn't to belittle the challenge or ynori7 at all, I'm just applying data that I ran to show that it is statistically impossible to actually optimize beyond a fraction of a microsecond when the print fucks it up.
Edit: also, when I output to a text file instead of using the print function, it again runs at less than 1us, which is further support that I can't improve much beyond changing the print. Feel free to copy my code and test it for yourself.
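If you want to reproduce that kind of measurement, this is roughly what I mean by looping only the editable part so the fixed cost of the prints stops dominating; the digit tally in the middle is just a placeholder, not my actual entry:
import time

RUNS = 1000
data = open('input.txt', 'r').read()

t = time.time()
for _ in xrange(RUNS):
    # placeholder for the part between the optimization markers
    counts = [0] * 10
    for ch in data:
        if '0' <= ch <= '9':
            counts[int(ch)] += 1
    repeats = len(set(counts)) != len(counts)
t2 = time.time()

# divide by RUNS so the per-run figure is on the same 1000*(t2-t) scale the contest timer reports
print "average per run: " + repr(1000 * (t2 - t) / RUNS)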
So what, you want to be able to remove the print statements to make it faster? Too bad. That's how life works. When you're in the industry and you're told to make your program run faster you can't just make it stop doing some of the things it's supposed to do.
Now, since the program HAS been optimized further than what you've done, you must be overlooking some things.
EDIT: and if you could remove your code from that post, that'd be nice. Don't want other people getting ideas from it.
ynori7 wrote: So what, you want to be able to remove the print statements to make it faster? Too bad. That's how life works. When you're in the industry and you're told to make your program run faster you can't just make it stop doing some of the things it's supposed to do.
Now, since the program HAS been optimized further than what you've done, you must be overlooking some things.
At least back up your talk with data. I CAN'T, IT'S STATISTICALLY IMPOSSIBLE. Though please, feel free to PROVE me wrong.
The code is not using enough resources. Might I suggest altering the code so that:
- it doesn't have any print statements
- people have to optimize some code that uses up far more processing cycles (don't use loops to make it more intensive)
I really hope this competition is going to be a success. Best of luck to all involved.
stdio wrote: [quote]COM wrote: stdio, when you tested that yourself, did you run it on Windows?
yes[/quote] Ah, then I understand. When I tested mine on Windows, the resulting times were very inaccurate and fucked up. Once tested on Linux, the times are more precise and actually comparable.
Edit: So you know what I mean, on Windows this is what I get: alternating times between 15 and 16, or just 0.0.
On Linux, I get very precise times, which for me are different from what ynori gets (mine shows between 0.4 and 1.1), but at least there it's comparable in that it shows a more specific answer.
COM wrote: [quote]stdio wrote: [quote]COM wrote: stdio, when you tested that yourself, did you run it on Windows?
yes[/quote] Ah, then I understand. When I tested mine on Windows, the resulting times were very inaccurate and fucked up. Once tested on Linux, the times are more precise and actually comparable.
Edit: So you know what I mean, on Windows this is what I get: alternating times between 15 and 16, or just 0.0.
On Linux, I get very precise times, which for me are different from what ynori gets (mine shows between 0.4 and 1.1), but at least there it's comparable in that it shows a more specific answer.[/quote]
That's a response I will buy; I was getting around 2-3us on mine in Linux. This at least allows me to track my program more accurately, to the point where it's not impossible to tell the difference. Though I still want to reiterate what spy said: this program doesn't use enough CPU cycles. Also, if it is being tested on Windows, then the output times will obviously be far more biased by luck.
All that said at least I have a place to start to rethink my solution.
f16e7 wrote: stdio, I don't think you should be handing out these instructions, it's an individual task, remember?
True, but it supports the point that just because something may be a fraction of a microsecond faster doesn't make it the best possible code. If I wanted timing to be the best I would rewrite the damn thing in assembly. People rarely program in it anymore because it's an inefficient use of resources (people's time), not because it isn't the fastest means of execution.
stdio wrote: True, but it supports the point that just because something may be a fraction of a microsecond faster doesn't make it the best possible code. If I wanted timing to be the best I would rewrite the damn thing in assembly. People rarely program in it anymore because it's an inefficient use of resources (people's time), not because it isn't the fastest means of execution. Then you can tell people generally that a faster design doesn't necessarily mean the best design. No need to tell them how to go about doing it.
I just changed how I'm testing the scripts to get a more accurate reading. I'm now running the original script along with each of your scripts (to account for ups and downs in my processor) and I then measure the speedup compared to the results from the original script.
The rankings are updated now with the new comparison method and new submissions.
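To give you an idea of what the comparison looks like, here is a bare-bones sketch of the relative-speedup idea; the two run_* functions are just placeholder workloads standing in for actually executing the original and a submission, not the real harness:
import time

def run_original():
    sum(xrange(5000000))   # placeholder workload standing in for executing the original script

def run_submission():
    sum(xrange(2500000))   # placeholder workload standing in for executing a submitted script

def timed(fn):
    t = time.time()
    fn()
    return time.time() - t

baseline = timed(run_original)
entry = timed(run_submission)
# the machine's ups and downs hit both runs, so the ratio stays meaningful
print "speedup vs original: " + repr(baseline / entry)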
I wasn't paying attention, where am I sending my solution?
I'm tempted to just post it up. It probably won't win (in case you guys haven't caught on, Python favours readable code over fast code).
Also, you never mentioned if you've got Psyco installed, because if you do, the first smartass to
try:
    import psyco
    psyco.full()
except ImportError:
    pass
will win :P
In any case I've finished with my entry, it's not too bad.
bdafae wrote: 10 runs? Holy shit. I'd rather you used 100 runs, or even better, a thousand. Why the small amount of executions? Because of the way Python manages its memory, if I automate the executions it gets 0 for the runtime about 80% of the time. I tried to figure out a way to flush the memory after each execution, but it didn't seem to work.
EDIT: And I do give everyone the benefit of the doubt. If I start getting numbers that look wrong (way too fast or way too slow) I throw the results out and try again. I also run the script far more than 10 times and briefly scan over the results to make sure they're consistent.
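If I do end up automating it, the timeit module can batch the executions so that runs which individually read as 0 still add up to something measurable. A minimal sketch, where run_once() is only a stand-in for a submitted script:
from timeit import Timer

def run_once():
    # stand-in for one full run of a submitted script
    data = open('input.txt', 'r').read()
    counts = [0] * 10
    for ch in data:
        if '0' <= ch <= '9':
            counts[int(ch)] += 1

if __name__ == '__main__':
    t = Timer("run_once()", "from __main__ import run_once")
    # three batches of 100 executions each; a whole batch registers even when a
    # single execution reads as 0, and the best batch is the least noisy
    batches = t.repeat(3, 100)
    print "best average per execution: " + repr(min(batches) / 100)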
I'm trying to rack my brain to remember a Python optimization tool that checks run speed throughout the script's execution to identify bottlenecks; that might be useful in this circumstance?!
I'm not involved in this, I know, but I agree 10 runs is way too few to try to identify such a small gap in the run times. I know it sounds like a lot of work, but how about posting a range alongside the average?
buttmonkey wrote: I'm trying to rack my brains to remember a python optimization tool that checked run speed through out the script running to identify bottlenecks that might be useful in this circumstance?! I believe Psyco does this.
When I was testing my code, I wrote a simple shell script that ran my script and the original, one after the other (with a pause between), and wrote the execution times of each to its own file. Then I simply wrote a Python script to show the averages of each script. Just an idea. ;)
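The averaging part was nothing fancy; something along these lines would do it (the file names here are made up for the example, with one recorded time per line in each file):
def average(path):
    values = [float(line) for line in open(path) if line.strip()]
    return sum(values) / len(values)

for name in ('my_times.txt', 'original_times.txt'):
    print name + " average: " + repr(average(name))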
ynori7 wrote: [quote]bdafae wrote: 10 runs? Holy shit. I'd rather you used 100 runs, or even better, a thousand. Why the small amount of executions? Because of the way Python manages its memory, if I automate the executions it gets 0 for the runtime about 80% of the time. I tried to figure out a way to flush the memory after each execution, but it didn't seem to work.
EDIT: And I do give everyone the benefit of the doubt. If I start getting numbers that look wrong (way too fast or way too slow) I throw the results out and try again. I also run the script far more than 10 times and briefly scan over the results to make sure they're consistent.[/quote]
The easiest way to avoid this is to use a bigger dataset. At the moment the difference between competitors is so small that it's statistically insignificant.
With an input.txt file containing 10 MB of mixed alphanumeric characters the difference becomes quite noticeable; oh, and it's statistically significant.
As an example, on a 1 MB file a small tweak dropped my result from ~1102ms to ~975ms, while the original takes ~3252ms.
Please be fair to anybody who's bothered to enter the competition and don't use a small dataset; if you do, the results will not be valid, as a wide range of outside factors can cause a 0.5 millisecond difference and thus invalidate anything you measure.
FYI, most consumer-grade hard drives have a seek time between 3 and 10 ms.
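Generating a test file like that only takes a few lines. A quick sketch, where the size and file name are just this example's choices:
import random
import string

SIZE = 10 * 1024 * 1024   # roughly 10 MB of mixed alphanumeric characters
chars = string.digits + string.ascii_letters
out = open('big_input.txt', 'w')
out.write(''.join(random.choice(chars) for i in xrange(SIZE)))
out.close()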
Can you provide us with the method you're using to rank submissions?
When everything's closed all I have are your personal assurances that the results are accurate & reproducible.
Are you taking into account the amount of time spent in-kernel and subtracting that from the results? Is the standard deviation small enough to not affect the mean in a way that would change ranking? What's the condition of the system that the "official" results are being computed on? Are you running it as a high-priority process? Hopefully to negate the effects of the rest of the system.
Even with these answers, as I've said, all we have are your assurances.
lemmingmolester wrote: Can you provide us with the method you're using to rank submissions? The method has already been provided. If you want more details I can tell you. I've paired each script with the original so they each get executed together and I take t2final=1000*(t2-t) to calculate the speedup.
When everything's closed all I have are your personal assurances that the results are accurate & reproducible. True enough, though you act like there's really something at stake here. It's not a presidential election. If somebody gets cheated, whoopdydoo, they just missed out on a few CP. I'm not an admin, I'm doing this for you guys, not because I'm obligated to.
Is the standard deviation small enough to not affect the mean in a way that would change ranking? Some people's code has a larger standard deviation than others'. It depends on how they wrote it. For example, COM's code is pretty consistent with only about +-.3, while stdio's has a much larger variance.
Are you taking into account the amount of time spent in-kernel and subtracting that from the results? What's the condition of the system that the "official" results are being computed on? Are you running it as a high-priority process? Hopefully to negate the effects of the rest of the system. Since they're all being executed on the same machine, it won't really matter much whether they get slowed down or not. I'm using relative times. And I run the scripts quite a few times and observe the results so I can see how they're behaving. If I start getting results that seem wrong I close everything and try again.
True enough, though you act like there's really something at stake here. It's not a presidential election. If somebody gets cheated, whoopdydoo, they just missed out on a few CP. I'm not an admin, I'm doing this for you guys, not because I'm obligated to.
It's called objective criticism; hopefully you'll go do some research into high-resolution timers and end up learning some interesting things instead.
At the moment it's a pissing contest over which side of an interrupt the main loop(s) finish on. Hell, time.time() isn't even using high-resolution timers, so you're only going to get accuracy down to about 0.015625 of a second, with the rest of it being an experiment in ways to get pseudo-random floating point noise. Why aren't you at least using time.clock()?
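If you want to see the difference for yourself, compare the two clocks. On Python 2 under Windows, time.clock() is backed by the high-resolution performance counter, while time.time() only ticks about every 15.6 ms; on Linux, clock() measures CPU time instead, so this particular comparison is Windows-specific:
import time

def elapsed(clock):
    start = clock()
    sum(xrange(200000))   # small workload to time
    return clock() - start

print "time.time():  " + repr(elapsed(time.time))    # often 0.0 on Windows
print "time.clock(): " + repr(elapsed(time.clock))   # small but non-zero there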
ynori7 : 1.05984735139
COM : 1.06420759881
Defience2 : 1.17680860859
Defience1 : 1.78079194828
just a panda : 2.13043020555
darkprincefrost : 2.24955297377
false : 2.36445404936
stdio1 : 4.15809439532
stdio2 : 4.35782889438
SwiftNomad : 9.78500797485
c4p_sl0ck : 11.5868839546
p4plus2 : 13.0557179698
lemmingmolester : 13.548157984
stealth- : 15.3230780889
Original : 38.8890140153
richo : 61.62740711
Those are the times for all entries as of now (in order of speed), averaged over 50 executions on my Windows PC. For the final testing I'll average over 100 executions on Linux.
After seeing your times, if anybody wants to make any last minute changes to their scripts you have until tomorrow. Note that Defience and stdio have two scripts in there, if you want to submit another script to see how it compares to your old one that's fine too.
This challenge has been fun, even though I've only been trying to learn Python for a few weeks and I only have college-level training in C/C++ and VB.NET.
The best I've come up with so far is to set t=time.time() again after the clunky for loop. Although, in doing that I've found a cool way to see where the bottleneck of the program was.
But that's not really optimizing, that's hacking in the loosest sense of the word. And I didn't even type it out, I just copied and pasted it so that's really a scriptkiddy move.
Needless to say when I gave this challenge a real effort I ended up with a pickledump in my dict.
E-5 wrote: The best I've come up with so far is to set t=time.time() again after the clunky for loop. Although, in doing that I've found a cool way to see where the bottleneck of the program was. Well, you can go ahead and try. I won't be using that to measure the time in the final testing, so it'll only slow it down further.
Alright everybody, here are the results of the tests:
ynori7 : 2.05816455846
stdio3 : 2.06205400286
otani : 2.06363439859
just a panda : 2.0648091261
COM : 2.13726479348
Defience2 : 2.29994460967
Defience1 : 3.47828702505
darkprincefrost : 4.3148679887
false : 4.535014478
stdio1 : 7.81700462582
SwiftNomad : 19.3935991771
c4p_sl0ck : 23.0868793168
p4plus2 : 26.4258313223
lemmingmolester : 26.9397008787
stealth- : 29.9493937423
Original : 78.7082909104
richo : 129.59382722
The number there is the time it took for your code to execute 100 times, measured in seconds. I ended up doing the final tests on Windows because some people's scripts didn't work on my Linux machine (possibly because it had Python 2.5). Regardless, the method I used to time them is accurate on Windows as well.
The winners are:
1st Place: stdio
2nd Place: otani
3rd Place: just a panda
I received permission from the following people to post their code: me, richo, com, false, defience, stealth-, swiftnomad, lemmingmolester, stdio, p4plus2. If any of the others want to post their code as well, go ahead.
Here's the script I used to test all the code (minus those who didn't give permission to post theirs). Entries are in alphabetical order.
############ORIGINAL SCRIPT#################
############################################
def original():
    file=open('input.txt', 'r')
    st=file.read()
    zeroes=ones=twos=threes=fours=fives=sixes=sevens=eights=nines=0
    repeats=False
    #Determines how many of each number there are in the string
    for x in range(0, len(st)):
        if(st[x]=='0'):
            zeroes+=1
        if(st[x]=='1'):
            ones+=1
        if(st[x]=='2'):
            twos+=1
        if(st[x]=='3'):
            threes+=1
        if(st[x]=='4'):
            fours+=1
        if(st[x]=='5'):
            fives+=1
        if(st[x]=='6'):
            sixes+=1
        if(st[x]=='7'):
            sevens+=1
        if(st[x]=='8'):
            eights+=1
        if(st[x]=='9'):
            nines+=1
    counts=[zeroes, ones, twos, threes, fours, fives, sixes, sevens, eights, nines]
    #Determines if there are any duplicates in the array of counts
    for x in range(0, len(counts)):
        for y in range(0, len(counts)):
            if(x==y): continue
            if(counts[x]==counts[y]): repeats=True
    #Print results
    for x in range(0, len(counts)):
        print "There are "+repr(counts[x])+" "+repr(x)+"'s"
    if(repeats==True):
        print "There was at least one repeat"
    else:
        print "There were no repeats"
############################################
###########OPTIMIZED SCRIPTS################
############################################
def COM():
    file=open('input.txt', 'r')
    str=file.read()
    ar=[repr(str.count("0")),repr(str.count("1")),repr(str.count("2")),repr(str.count("3")),repr(str.count("4")),repr(str.count("5")),repr(str.count("6")),repr(str.count("7")),repr(str.count("8")),repr(str.count("9"))]
    a="There are "
    c="'s\n"
    if(len(dict.fromkeys(ar))!=10):
        print a+ar[0]+" 0"+c+a+ar[1]+" 1"+c+a+ar[2]+" 2"+c+a+ar[3]+" 3"+c+a+ar[4]+" 4"+c+a+ar[5]+" 5"+c+a+ar[6]+" 6"+c+a+ar[7]+" 7"+c+a+ar[8]+" 8"+c+a+ar[9]+" 9"+c+"There was at least one repeat"
    else:
        print a+ar[0]+" 0"+c+a+ar[1]+" 1"+c+a+ar[2]+" 2"+c+a+ar[3]+" 3"+c+a+ar[4]+" 4"+c+a+ar[5]+" 5"+c+a+ar[6]+" 6"+c+a+ar[7]+" 7"+c+a+ar[8]+" 8"+c+a+ar[9]+" 9"+c+"There were no repeats"
##############################################
def Defience1():
    f=open('input.txt', 'r')
    s=f.read()
    print("\n".join(("There are %d: %d" % (s.count(str(i)), i)) for i in range(10)))
    a=(s.count('0'),s.count('1'),s.count('2'),s.count('3'),s.count('4'),s.count('5'),s.count('6'),s.count('7'),s.count('8'),s.count('9'))
    b=set(a)
    if len(b)!=len(a):
        print "There was at least one repeat"
    else:
        print "There were no repeats"
################################################
def Defience2():
    f=open('input.txt', 'r')
    s=f.read()
    counts=[s.count(str(i)) for i in range(10)]
    print("\n".join(("There are %d: %d" % ((c, i)) for (i, c) in enumerate(counts))))
    b=set(counts)
    if len(b) != len(counts):
        print "at least one duplicate found"
    else:
        print "None found"
#############################################
def false():
    import sys
    orefs = [sys.getrefcount(x) for x in map(str, range(10))]
    file=open('input.txt')
    data=list(file.read())
    file.close()
    urefs = [sys.getrefcount(x) for x in map(str, range(10))]
    drefs = map(lambda x: x[0]-x[1], zip(urefs, orefs))
    i = 0
    for v in drefs:
        print "There are {0} {1}'s".format(v, i)
        i += 1
    if len({}.fromkeys(drefs)) != len(drefs):
        print "There was at least one repeat"
    else:
        print "There where no repeats"
##############################################
def lemmingmolester():
    cunt = [0,0,0,0,0,0,0,0,0,0]
    str = open('input.txt', 'r').read()
    arse = 0
    for char in str:
        arse = ord(char) - 48
        if (arse >= 0) and (arse <= 9):
            cunt[arse] = cunt[arse] + 1
    for x in range(0, len(cunt)):
        print "There are " + repr(cunt[x]) + " " + repr(x) + "'s"
    if len({}.fromkeys(cunt).keys()) != len(cunt):
        print "There was at least one repeat"
    else:
        print "There were no repeats"
###############################################
def p4plus2():
    file=open('input.txt', 'r')
    s=file.read()
    a = [0,0,0,0,0,0,0,0,0,0]
    r = l = len(s)
    for x in range(0, l):
        a[ord(s[x])-48]+=1
    for x in range(0, 10):
        for y in range(0, 10):
            if(x==y): continue
            if(a[x]==a[y]): r=1
    print 'There are 0 '+repr(a[0])+"'s"
    print 'There are 1 '+repr(a[1])+"'s"
    print 'There are 2 '+repr(a[2])+"'s"
    print 'There are 3 '+repr(a[3])+"'s"
    print 'There are 4 '+repr(a[4])+"'s"
    print 'There are 5 '+repr(a[5])+"'s"
    print 'There are 6 '+repr(a[6])+"'s"
    print 'There are 7 '+repr(a[7])+"'s"
    print 'There are 8 '+repr(a[8])+"'s"
    print 'There are 9 '+repr(a[9])+"'s"
    if(r==1):
        print 'There was at least one repeat'
    else:
        print 'There were no repeats'
#############################################
def richo():
    file=open('input.txt', 'r')
    counts = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    repeats=False
    last = -1
    #Determines how many of each number there are in the string
    for line in file:
        for x in line:
            y = int(x)
            try:
                counts[y] += 1
                if last == y: repeats = True
            except IndexError:# Something wasn't a number
                pass # I don't care
    #Print results
    for x in range(10):
        print "There are %i %i's" % (counts[x], x)
    if(repeats==True):
        print "There was at least one repeat"
    else:
        print "There were no repeats"
##################################################
def stdio1():
    import re
    f=open('input.txt', 'r')
    a=f.read()
    j=[]
    b=re.compile(str('0'))
    j.append(len(b.findall(a)))
    b=re.compile(str('1'))
    j.append(len(b.findall(a)))
    b=re.compile(str('2'))
    j.append(len(b.findall(a)))
    b=re.compile(str('3'))
    j.append(len(b.findall(a)))
    b=re.compile(str('4'))
    j.append(len(b.findall(a)))
    b=re.compile(str('5'))
    j.append(len(b.findall(a)))
    b=re.compile(str('6'))
    j.append(len(b.findall(a)))
    b=re.compile(str('7'))
    j.append(len(b.findall(a)))
    b=re.compile(str('8'))
    j.append(len(b.findall(a)))
    b=re.compile(str('9'))
    j.append(len(b.findall(a)))
    if len(j)==len(set(j)):
        print "There are %s 0's\n\
There are %d 1's\n\
There are %d 2's\n\
There are %d 3's\n\
There are %d 4's\n\
There are %d 5's\n\
There are %d 6's\n\
There are %d 7's\n\
There are %d 8's\n\
There are %d 9's\nThere are no repeats" % \
        (j[0],j[1],j[2],j[3],j[4],j[5],j[6],j[7],j[8],j[9])
    else:
        print "There are %s 0's\n\
There are %d 1's\n\
There are %d 2's\n\
There are %d 3's\n\
There are %d 4's\n\
There are %d 5's\n\
There are %d 6's\n\
There are %d 7's\n\
There are %d 8's\n\
There are %d 9's\nThere was atleast one repeat" % \
        (j[0],j[1],j[2],j[3],j[4],j[5],j[6],j[7],j[8],j[9])
#################################################
def stdio3():
    f=open('input.txt', 'r')
    a=f.read()
    j=a.count("0"),a.count("1"),a.count("2"),a.count("3"),a.count("4"),a.count("5"),a.count("6"),a.count("7"),a.count("8"),a.count("9")
    if len(j)==len(set(j)):
        print "There are %s 0's\n\
There are %d 1's\n\
There are %d 2's\n\
There are %d 3's\n\
There are %d 4's\n\
There are %d 5's\n\
There are %d 6's\n\
There are %d 7's\n\
There are %d 8's\n\
There are %d 9's\nThere are no repeats" % \
        (j[0],j[1],j[2],j[3],j[4],j[5],j[6],j[7],j[8],j[9])
    else:
        print "There are %s 0's\n\
There are %d 1's\n\
There are %d 2's\n\
There are %d 3's\n\
There are %d 4's\n\
There are %d 5's\n\
There are %d 6's\n\
There are %d 7's\n\
There are %d 8's\n\
There are %d 9's\nThere was at least 1 repeat" % \
        (j[0],j[1],j[2],j[3],j[4],j[5],j[6],j[7],j[8],j[9])
##############################################
def stealth():
    file=open('input.txt')
    zeroes=0
    ones=0
    twos=0
    threes=0
    fours=0
    fives=0
    sixes=0
    sevens=0
    eights=0
    nines=0
    repeats=0
    for c in file.read():
        if c=='0': zeroes+=1
        elif c=='1': ones+=1
        elif c=='2': twos+=1
        elif c=='3': threes+=1
        elif c=='4': fours+=1
        elif c=='5': fives+=1
        elif c=='6': sixes+=1
        elif c=='7': sevens+=1
        elif c=='8': eights+=1
        elif c=='9': nines+=1
    counts=[zeroes, ones, twos, threes, fours, fives, sixes, sevens, eights, nines]
    a = range(10)
    for x in a:
        for y in a:
            if(x==y): continue
            elif(counts[x]==counts[y]): repeats=1; break
    for x in a: print "There are %d %d's" % (counts[x],x)
    if repeats: print "There was at least one repeat"
    else: print "There were no repeats"
##############################################
def SwiftNomad():
    import re
    numbers = dict()
    numbers = {'\n': 0, '0': 0, '1': 0, '2': 0, '3': 0, '4': 0, '5': 0, '6': 0, '7': 0, '8': 0, '9': 0 }
    file = open('input.txt', 'r')
    str=file.read()
    str = re.sub("[^0-9]", "", str)
    repeats = False
    total = len(str)
    for x in range(0, total):
        numbers[str[x]] += 1
    for x in numbers:
        for y in numbers:
            if(x == y):
                continue
            elif(numbers[x] == numbers[y]):
                repeats = True
    print 'There are %d 0\'s' % (numbers['0'])
    print 'There are %d 1\'s' % (numbers['1'])
    print 'There are %d 2\'s' % (numbers['2'])
    print 'There are %d 3\'s' % (numbers['3'])
    print 'There are %d 4\'s' % (numbers['4'])
    print 'There are %d 5\'s' % (numbers['5'])
    print 'There are %d 6\'s' % (numbers['6'])
    print 'There are %d 7\'s' % (numbers['7'])
    print 'There are %d 8\'s' % (numbers['8'])
    print 'There are %d 9\'s' % (numbers['9'])
    if(repeats==True):
        print "There was at least one repeat"
    else:
        print "There were no repeats"
#################################################
def ynori7():
    file=open('input.txt', 'r')
    st=file.read()
    ar=[]
    ar+=[st.count("0")]
    ar+=[st.count("1")]
    ar+=[st.count("2")]
    ar+=[st.count("3")]
    ar+=[st.count("4")]
    ar+=[st.count("5")]
    ar+=[st.count("6")]
    ar+=[st.count("7")]
    ar+=[st.count("8")]
    ar+=[st.count("9")]
    ar=map(str, ar)
    if(len(set(ar))<10):
        out = "There was at least one repeat"
    else:
        out = "There were no repeats"
    out="".join(map(lambda b,n:"There are "+b+" "+n+"'s\n", ar, map(str, range(0, 10))))+out
    print out
#################################################
if __name__=='__main__':
    from timeit import Timer
    times={}
    t = Timer("original()", "from __main__ import original")
    times["Original"]=t.timeit(100)
    t = Timer("COM()", "from __main__ import COM")
    times["COM"]=t.timeit(100)
    t = Timer("ynori7()", "from __main__ import ynori7")
    times["ynori7"]=t.timeit(100)
    t = Timer("stealth()", "from __main__ import stealth")
    times["stealth-"]=t.timeit(100)
    t = Timer("stdio1()", "from __main__ import stdio1")
    times["stdio1"]=t.timeit(100)
    t = Timer("stdio3()", "from __main__ import stdio3")
    times["stdio3"]=t.timeit(100)
    t = Timer("p4plus2()", "from __main__ import p4plus2")
    times["p4plus2"]=t.timeit(100)
    t = Timer("Defience1()", "from __main__ import Defience1")
    times["Defience1"]=t.timeit(100)
    t = Timer("Defience2()", "from __main__ import Defience2")
    times["Defience2"]=t.timeit(100)
    t = Timer("SwiftNomad()", "from __main__ import SwiftNomad")
    times["SwiftNomad"]=t.timeit(100)
    t = Timer("false()", "from __main__ import false")
    times["false"]=t.timeit(100)
    t = Timer("richo()", "from __main__ import richo")
    times["richo"]=t.timeit(100)
    t = Timer("lemmingmolester()", "from __main__ import lemmingmolester")
    times["lemmingmolester"]=t.timeit(100)
    items=times.items()
    items.sort(lambda x,y: cmp(x[1], y[1]))
    for x in items:
        print str(x[0])+" : "+str(x[1])
stdio wrote: Anyways thanks for running this. It seems that all of the top speed ones used count() – after trying many other ways, this did seem to be the best. Yeah, though I like some of the other solutions too even though they didn't win. I particularly like false's solution because I never would have thought of using a method like that. As for my own solution, I'm surprised it did as well as it did since I was just trying to do something different than everyone else.
Anyway, thanks to everybody who participated. The winners will get their prizes as soon as I'm able to contact Mr Cheese.
Since I wasn't able to give permission to post my code, here it is. http://www.otanii.com/otani-optimized.py