Welcome to HBH! If you have tried to register and didn't get a verification email, please using the following link to resend the verification email.

HBH Python Competition


ynori7's Avatar
Future Emperor of Earth
0 0

Hey guys, I've noticed there's been a lot of inactivity here on HBH lately, so I got Mr_Cheese to approve a new code optimization competition.

The Rules:

  • The script can be edited as much as you want between the "Begin Optimization" and "End Optimization" comments.
  • The script must take input from a text file. Don't assume I'll use the same data when I test your code.
  • If two identical scripts are turned in, the first delivered will be the one participating.
  • Everyone can participate (even administrators). I won't be participating since I wrote the script.
  • The winner will be the person with the least execution time on the script.
  • The scripts will tested multiple times and I will take the average time.

Prizes: 1st - 100 challenge points. 2nd - 75 challenge points. 3rd - 50 challenge points. All winners will also have a cool little recognition stated on their profiles.

All submissions must be in by November 6th.

If anything here is unclear, feel free to ask me via pm, or here in this thread. This thread is for questions concerning this topic only.

The script reads a long string of numbers from a file, and then counts the number of ones, twos, threes, etc. It then puts those into a list and determines if there are any duplicates (e.g. if there are 10 ones and 10 twos). Then the program prints the results.

Here is the code:

t=time.time()
################################
#########BEGIN OPTIMIZATION########
################################
file=open('input.txt', 'r')
str=file.read()

zeroes=ones=twos=threes=fours=fives=sixes=sevens=eights=nines=0
repeats=False

#Determines how many of each number there are in the string
for x in range(0, len(str)):
	if(str[x]=='0'):
		zeroes+=1
	if(str[x]=='1'):
		ones+=1
	if(str[x]=='2'):
		twos+=1
	if(str[x]=='3'):
		threes+=1
	if(str[x]=='4'):
		fours+=1
	if(str[x]=='5'):
		fives+=1
	if(str[x]=='6'):
		sixes+=1
	if(str[x]=='7'):
		sevens+=1
	if(str[x]=='8'):
		eights+=1
	if(str[x]=='9'):
		nines+=1

counts=[zeroes, ones, twos, threes, fours, fives, sixes, sevens, eights, nines]

#Determines if there are any duplicates in the array of counts
for x in range(0, len(counts)):
	for y in range(0, len(counts)):
		if(x==y): continue
		if(counts[x]==counts[y]): repeats=True

#Print results
for x in range(0, len(counts)):
	print "There are "+repr(counts[x])+" "+repr(x)+"'s"
if(repeats==True):
	print "There was at least one repeat"
else:
	print "There were no repeats"
	
################################
########END OPTIMIZATION########
################################
#calculate and show time usage
t2=time.time()
print "Time elapsed: "+repr(1000*(t2-t)) #microseconds

And here is the data (input.txt):

The output should look like:

There are 1978 1's
There are 1891 2's
There are 1621 3's
There are 1864 4's
There are 2020 5's
There are 1503 6's
There are 1837 7's
There are 1549 8's
There are 1078 9's
There were no repeats```

**Current Rankings:**
They're getting pretty close. I'm going to post a few extra people in the rankings:
Me - 4.7528489854350002
COM - 4.7472755556899999
stdio - 4.7471755556899999
just a panda - 4.7315940047550002
otani - 4.58821655726
Defience - 4.3426556233060003
Original Script - 1

The rankings are measured in "x times speedup".
Note, I'm not in the competition, I just put mine into the ranking for comparison.

stealth-'s Avatar
Ninja Extreme
0 0

I'm assuming we only get one submission? Say, for instance that I hand mine in because I didn't want someone to hand there's in before me and have the same code, but then later I find out because I was rushing I missed a huge optimisation. Could I hand in a replacement?


korg's Avatar
Admin from hell
0 0

Ah, Python I'll be in this for sure. I don't see why you can't submit more than one, We're going for time not quantity.


ghost's Avatar
0 0

Im assuming from looking at your code, you are using python 2.6 not 3.0 correct?


pimpim's Avatar
Member
0 0

stdio wrote: Im assuming from looking at your code, you are using python 2.6 not 3.0 correct?

Looks like 2.X

Python <3


ynori7's Avatar
Future Emperor of Earth
0 0

stealth- wrote: I'm assuming we only get one submission? Say, for instance that I hand mine in because I didn't want someone to hand there's in before me and have the same code, but then later I find out because I was rushing I missed a huge optimisation. Could I hand in a replacement? How about I use whichever version gives the best time?

And i'm running python 2.5. That should be fine with any 2.X since there shouldn't be any need for extra libraries. Python 3 sucks.


ghost's Avatar
0 0

Awesome, I'll be sure to give it a shot.


ghost's Avatar
0 0

as an added after thought while writing my optimization. Generally its bad programming to name variables the same as defined functions. IE you labeled as your input sting as "str" when str() is a function in python. Just an observational note. Not sure if it was intentional or not.


ynori7's Avatar
Future Emperor of Earth
0 0

Also, since there can be only 3 winners, ties (or code with very close times) will also be judged by eye (I'll look for other optimizations that are present that may not have made a noticeable difference, and clever solutions).


ghost's Avatar
0 0

I assume by "repeats" you mean two identical numbers occur after eachother in the string?

ynori7 wrote: How about I use whichever version gives the best time?

And i'm running python 2.5. That should be fine with any 2.X since there shouldn't be any need for extra libraries. Python 3 sucks. My first tests on the unoptimized code shows me that python3 runs the script in half the time python2.6 does it.


ynori7's Avatar
Future Emperor of Earth
0 0

ctrl_ wrote: I assume by "repeats" you mean two identical numbers occur after eachother in the string? Say you have the list(1, 2, 3, 4, 3). In that list, 3 occurs twice. That is a repeat. I suppose "duplicates" would have been a better word.

My first tests on the unoptimized code shows me that python3 runs the script in half the time python2.6 does it. It doesn't matter, they'll all be tested on the same machine with the same version of python.


ghost's Avatar
0 0

ynori7 wrote: [quote]ctrl_ wrote: I assume by "repeats" you mean two identical numbers occur after eachother in the string? Say you have the list(1, 2, 3, 4, 3). In that list, 3 occurs twice. That is a repeat. I suppose "duplicates" would have been a better word. [/quote] This is where I'm getting confused. By this definition, wouldn't counting the number of occurrences be the same thing as counting how many "duplicates/repeats" there are?


ghost's Avatar
0 0

No no. Alright, say the string you read from the file is this "1234512". Now you total up the number of occurrences of each number: 2 ones 2 twos 1 three 1 four 1 five Now, those are stored into a list like this [2, 2, 1, 1, 1]. You now have duplicates in that list. There are two 2's and three 1's. That is what the second part of the algorithm checks.


ynori7's Avatar
Future Emperor of Earth
0 0

Indeed. Like Aphophis showed, the second part of the algorithm detects whether or not there are any duplicates in the counts list. If you need any further clarification, don't hesitate to ask.


ghost's Avatar
0 0

Hey I had a suggestion, though it does mean slightly more work for you Ynori7.

Looking at the past php competition for example, everyone submitted their code and at the end winners were announced. This doesn't seem to competitive, or at least I would like to see it more so.

Would it be possible to update the times of the top 3 on a weekly basis? You gave us 1 month to rewrite code that takes about ~20min. Knowing the current standings and times to beat, would be more incentive for both myself and others who submit optimizations to rethink the way they did things and in the end, we will end up with better code.


ynori7's Avatar
Future Emperor of Earth
0 0

stdio wrote: Would it be possible to update the times of the top 3 on a weekly basis? You gave us 1 month to rewrite code that takes about ~20min. Knowing the current standings and times to beat, would be more incentive for both myself and others who submit optimizations to rethink the way they did things and in the end, we will end up with better code. Well, posting the times wont all that helpful to people since times on my computer will be different than times on yours, but I could update the post every so often to say who the three current leaders are.

EDIT: Done. The current rankings are at the bottom of the original post.


stealth-'s Avatar
Ninja Extreme
0 0

stdio wrote: Hey I had a suggestion, though it does mean slightly more work for you Ynori7.

Looking at the past php competition for example, everyone submitted their code and at the end winners were announced. This doesn't seem to competitive, or at least I would like to see it more so.

Would it be possible to update the times of the top 3 on a weekly basis? You gave us 1 month to rewrite code that takes about ~20min. Knowing the current standings and times to beat, would be more incentive for both myself and others who submit optimizations to rethink the way they did things and in the end, we will end up with better code.

That was a good idea, now I can see I definitely need to work on mine. :)


ghost's Avatar
0 0

Im going to try my best to explain why this challenge is skewed. Bare with me and understand all times reported are off of my computer. Though I will relate them back to Ynori7's First lets look at the original code

import time
t=time.time()
################################
#######BEGING OPTIMIZATION######
################################
file=open(&#39;input.txt&#39;, &#39;r&#39;)
str=file.read()

zeroes=ones=twos=threes=fours=fives=sixes=sevens=eights=nines=0
repeats=False

#Determines how many of each number there are in the string
for x in range(0, len(str)):
if(str[x]==&#39;0&#39;):
zeroes+=1
if(str[x]==&#39;1&#39;):
ones+=1
if(str[x]==&#39;2&#39;):
twos+=1
if(str[x]==&#39;3&#39;):
threes+=1
if(str[x]==&#39;4&#39;):
fours+=1
if(str[x]==&#39;5&#39;):
fives+=1
if(str[x]==&#39;6&#39;):
sixes+=1
if(str[x]==&#39;7&#39;):
sevens+=1
if(str[x]==&#39;8&#39;):
eights+=1
if(str[x]==&#39;9&#39;):
nines+=1

counts=[zeroes, ones, twos, threes, fours, fives, sixes, sevens, eights, nines]

#Determines if there are any duplicates in the array of counts
for x in range(0, len(counts)):
for y in range(0, len(counts)):
if(x==y): continue
if(counts[x]==counts[y]): repeats=True

#Print results
for x in range(0, len(counts)):
print &quot;There are &quot;+repr(counts[x])+&quot; &quot;+repr(x)+&quot;&#39;s&quot;
if(repeats==True):
print &quot;There was at least one repeat&quot;
else:
print &quot;There were no repeats&quot;

################################
########END OPTIMIZATION########
################################
#calculate and show time usage
t2=time.time()
print &quot;Time elapsed: &quot;+repr(1000*(t2-t)) #microseconds

I ran this 10 times with an avg t of about ~35us I then commented out the print lines and got the algorithm to run at an avg time of t 9.9us. This tells me that the algorithm can be improved.

so I redid the program (and yes Im giving out my answer to support my tests)

Edited for sake of competion

Again I ran it 10 times with an avg of t of 20us. I then choose to comment out the print lines and redo it. Now the avg time was too small to detect so giving me back a t of 0.0. So I ran the algorithm without the print in a loop a 1000 times. This time I was getting an avg time of t=399us. Which suggests that the algorithm part and the part that we are allowed to modify is running at .4us.

Now total I was at 20us on my machine and 19.6us of that is soley in the print function. Which is MANDATORY TO HAVE. Thats 98% of the time is out of my control.

Now lets look at the results, I ran at 42us and COM ran his at 20us. This suggests that either there is not the same print output being displayed, or more likely there is noise filling the data. By noise I mean things beyond our control.

To get a 50% reduction, the only thing I can change is the print function, but Im not allowed to do it. So really its just a matter of when was the computer restarted, shit to do with pythons garbage collection, lucky ass timing with cpus and a host of other shit thats out of control.

Just to test this again I ran the following.

import time
t=time.time()
################################
#######BEGING OPTIMIZATION######
################################
for i in range(0,11):
	print i
################################
########END OPTIMIZATION########
################################
#calculate and show time usage
t2=time.time()
print &quot;Time elapsed: &quot;+repr(1000*(t2-t)) #microseconds

This code is just printing 11 lines. Thats it and it runs at 20us on my machine. and my code above runs at 20us they are indistinguishable.

In conclusion, I have rewritten the algorithm from ~9ish us to less than 1us and the print function is fucking everything up. Unless you force the algorithm part (the part we can edit) to loop enough that the print timing becomes statistically insignificant, this challenge is bullshit.

Note: Im not saying my program is the best, Im saying the print function is fucking everything up to the point where I cant decipher how to improve it.

Also: This isnt too belittle the challenge at all or ynori7, just applying data that I ran to show that this is statistically impossible to actually optimize beyond a fraction of a microsecond and have the print fuck it up.

Edit: Also when I output to a text file and not use the print function, it again runs at less then 1us. Which is further support that I cant improve much beyond changing the print. Feel free to copy my code and test it for yourself.


ynori7's Avatar
Future Emperor of Earth
0 0

So what, you want to be able to remove the print statements to make it faster? Too bad. That's how life works. When you're in the industry and you're told to make your program run faster you can't just make it stop doing some of the things it's supposed to do.

Now, since the program HAS been optimized further than what you've done, you must be overlooking some things.

EDIT: and if you could remove your code from that post, that'd be nice. Don't want other people getting ideas from it.


ghost's Avatar
0 0

ynori7 wrote: So what, you want to be able to remove the print statements to make it faster? Too bad. That's how life works. When you're in the industry and you're told to make your program run faster you can't just make it stop doing some of the things it's supposed to do.

Now, since the program HAS been optimized further that what you've done, you must be overlooking some things.

At least back up your talk with data. I CANT, ITS STATISTICALLY IMPOSSIBLE. Though please, feel free to PROVE me wrong.


ghost's Avatar
0 0

The code is not using enough resources. Might I suggest altering the code, so that;

  1. it doesn't have any print statements
  2. people have to optimize some code that uses up far more processing cycles (don't use loops to make it more intensive)

I really hope this competition is going to be a success. Best of luck to all involved.


ghost's Avatar
0 0

stdio, when you tested that yourself, did you run it on windows?


ghost's Avatar
0 0

COM wrote: stdio, when you tested that yourself, did you run it on windows?

yes


ghost's Avatar
0 0

stdio wrote: [quote]COM wrote: stdio, when you tested that yourself, did you run it on windows?

yes[/quote] Ah, then I understand. When I tested mine on windows, the resulting times were very inaccurate and fucked up. Once tested on linux, the times are more precise and actually comparable.

Edit: So you know what I mean, on windows this is what I get: alternating times between 15 and 16 or just 0.0

On linux, I get very precis times, which for me are different from what ynori gets (mine shows between 0.4 and 1.1), but at least there it's comparable the way that it shows a more specific answer.


ghost's Avatar
0 0

COM wrote: [quote]stdio wrote: [quote]COM wrote: stdio, when you tested that yourself, did you run it on windows?

yes[/quote] Ah, then I understand. When I tested mine on windows, the resulting times were very inaccurate and fucked up. Once tested on linux, the times are more precise and actually comparable.

Edit: So you know what I mean, on windows this is what I get: alternating times between 15 and 16 or just 0.0

On linux, I get very precis times, which for me are different from what ynori gets (mine shows between 0.4 and 1.1), but at least there it's comparable the way that it shows a more specific answer.[/quote]

A response I will buy, I was getting around 2-3us on mine in linux. This at least allows me to more accurately track my program. To the point where its not impossible to tell the difference. Though I still want to reiterate what spy said and that this program doesn't use enough cpu cycles. Also if it is being tested in windows, then the output times will obviously be far more biased on luck.

All that said at least I have a place to start to rethink my solution.


ghost's Avatar
0 0

stdio wrote: Also if it is being tested in windows, then the output times will obviously be far more biased on luck. Just in case you're referring to ynori's tests, no, they're not done on windows.


ghost's Avatar
0 0

Then you can tell people generally that a faster design doesn't necessarily mean the best design. No need to tell them how to go about doing it.

Fine fine, What COM said ^^


ghost's Avatar
0 0

stdio, I don't think you should be handing out these instructions, it's an individual task, remember?


ghost's Avatar
0 0

f16e7 wrote: stdio, I don't think you should be handing out these instructions, it's an individual task, remember?

True but it supports that just because something may be a fraction of a microsecond faster, doesnt make it the best possible code. If I wanted timing to be the best I would rewrite the damn thing in assembly. People rarely program in this anymore because its an inefficient use of resources (peoples time), not because it isnt the fastest means of execution.


ghost's Avatar
0 0

stdio wrote: True but it supports that just because something may be a fraction of a microsecond faster, doesnt make it the best possible code. If I wanted timing to be the best I would rewrite the damn thing in assembly. People rarely program in this anymore because its an inefficient use of resources (peoples time), not because it isnt the fastest means of execution. Then you can tell people generally that a faster design doesn't necessarily mean the best design. No need to tell them how to go about doing it.


ynori7's Avatar
Future Emperor of Earth
0 0

I just changed how I'm testing the scripts to get a more accurate reading. I'm now running the original script along with each of your scripts (to account for ups and downs in my processor) and I then measure the speedup compared to the results from the original script.

The rankings are updated now with the new comparison method and new submissions.


ynori7's Avatar
Future Emperor of Earth
0 0

MoshBat wrote: Do note that running the script three or so times, then taking a mean would also create a greater degree of accuracy. I average them over 10 executions.


ghost's Avatar
0 0

Any chance of posting times for all submissions, or will it just be the top 3?


ynori7's Avatar
Future Emperor of Earth
0 0

Defience wrote: Any chance of posting times for all submissions, or will it just be the top 3? I'll do that once I do the final measurements. Also, once the winners are announced anybody who wants to is encouraged to post their solutions so others can see. Not now though.


ghost's Avatar
0 0

Cool. It would be interesting to see other's solutions and ideas as well.


richohealey's Avatar
Python Ninja
0 0

I've not got much on at work- I'll take a look.


richohealey's Avatar
Python Ninja
0 0

I wasn't paying attention, where am I sending my solution?

I'm tempted to just post it up. It probably won't win (in case you guys haven't caught on python favours readable code over fast code).

Also, you never mentioned if you've got psycho installed, because if you do the first smartass to

try:
    import psycho
    psycho.full()

Will win :P

In any case I've finished with my entry, it's not too bad.


ynori7's Avatar
Future Emperor of Earth
0 0

richohealey wrote: I wasn't paying attention, where am I sending my solution? Just PM or email it to me (code tags, link, whatever).

And I don't have psyco installed.


ghost's Avatar
0 0

I keep getting different times each time I run it.


ynori7's Avatar
Future Emperor of Earth
0 0

sunglasses333 wrote: I keep getting different times each time I run it. Some scripts have more variance than others. I've gotten a few that vary from 2x speedup to 6x, and others that are a constant 4x. That's why I'm averaging over 10 executions.


ghost's Avatar
0 0

10 runs? Holy shit. I'd rather you used 100 runs, or even better, a thousand. Why the small amount of executions?


ynori7's Avatar
Future Emperor of Earth
0 0

bdafae wrote: 10 runs? Holy shit. I'd rather you used 100 runs, or even better, a thousand. Why the small amount of executions? Because the way python manages its memory, if I automate the executions it get's 0 for the runtime about 80% of the time. I tried to figure a way to flush the memory after each execution, but it didn't seem to work.

EDIT: And I do give everyone the benefit of the doubt. If I start getting numbers that look wrong (way too fast or way too slow) I throw the results out and try again. I also run the script far more than 10 times and briefly scan over the results to make sure they're consistent.


ghost's Avatar
0 0

I'm trying to rack my brains to remember a python optimization tool that checked run speed through out the script running to identify bottlenecks that might be useful in this circumstance?!

I'm not involved in this I know, but I agree 10 times is way to small to try and identify such a small gap in the run times. I know it sounds like a lot of work, but hoe about posting a range along side the average?


ghost's Avatar
0 0

buttmonkey wrote: I'm trying to rack my brains to remember a python optimization tool that checked run speed through out the script running to identify bottlenecks that might be useful in this circumstance?! I believe Psyco does this.

When I was testing my code, I wrote a simple shell script that ran my script and the original, one after the other (with a pause between) and wrote the execution times of each to it's own file. Then I simply wrote a Python script to show the averages of each script. Just an idea. ;)


ynori7's Avatar
Future Emperor of Earth
0 0

Only a few more days left, so anybody who wants to submit something better do it soon.


ghost's Avatar
0 0

ynori7 wrote: [quote]bdafae wrote: 10 runs? Holy shit. I'd rather you used 100 runs, or even better, a thousand. Why the small amount of executions? Because the way python manages its memory, if I automate the executions it get's 0 for the runtime about 80% of the time. I tried to figure a way to flush the memory after each execution, but it didn't seem to work.

EDIT: And I do give everyone the benefit of the doubt. If I start getting numbers that look wrong (way too fast or way too slow) I throw the results out and try again. I also run the script far more than 10 times and briefly scan over the results to make sure they're consistent.[/quote]

The easiest way to avoid this is to use a bigger dataset. At the moment the different between competitors is so small that it's statistically insignificant.

With an input.txt file containing 10mb of mixed alphanumeric characters the difference becomes quite noticeable, oh and it's statistically significant.

As an example, on a 1mb file a small tweak dropped my result from ~1102 to ~975, while the original takes ~3252ms.

Please be fair to anybody who's bothered to enter the competition and don't use a small dataset, if you do the results will not be valid as such a wide range of outside factors can cause a .5 millisecond difference in results and thus invalidate any results you get.

FYI, most consumer grade hard-drives have a seek time between 10 and 3ms.


ynori7's Avatar
Future Emperor of Earth
0 0

I'm already using a slightly different input than the original. Right now all my current testing is there for is so people have some idea where they stand. I'll use a bigger input file when I do the final testing.


ghost's Avatar
0 0

Can you provide us with the method you're using to rank submissions?

When everything's closed all I have are your personal assurances that the results are accurate & reproducible.

Are you taking into account the amount of time spent in-kernel and subtracting that from the results? Is the standard deviation small enough to not affect the mean in a way that would change ranking? What's the condition of the system that the "official" results are being computed on? Are you running it as a high-priority process? Hopefully to negate the effects of the rest of the system.

Even with these answers, as i've said, all we have are your assurances.


ynori7's Avatar
Future Emperor of Earth
0 0

lemmingmolester wrote: Can you provide us with the method you're using to rank submissions? The method has already been provided. If you want more details I can tell you. I've paired each script with the original so they each get executed together and I take t2final=1000*(t2-t) to calculate the speedup.

When everything's closed all I have are your personal assurances that the results are accurate & reproducible. True enough, thought you act like there's really something at stake here. It's not a presidential election. If somebody gets cheated, whoopdydoo they just missed out on a few CP. I'm not an admin, I'm doing this for you guys, not because I'm obligated to.

Is the standard deviation small enough to not affect the mean in a way that would change ranking? Some peoples' code has a larger standard deviation than others'. It depends on how they wrote it. For example, COM's code is pretty consistent with only about +-.3, while stdio's has a much larger variance.

Are you taking into account the amount of time spent in-kernel and subtracting that from the results? What's the condition of the system that the "official" results are being computed on? Are you running it as a high-priority process? Hopefully to negate the effects of the rest of the system. Since they're all being executed on the same machine, it wont really matter much whether they get slowed down or not. I'm using relative times. And I run the scripts quite a few times and observe the results so I can see how it's behaving. If I start getting results that seem wrong I close everything and try again.


ghost's Avatar
0 0

True enough, thought you act like there's really something at stake here. It's not a presidential election. If somebody gets cheated, whoopdydoo they just missed out on a few CP. I'm not an admin, I'm doing this for you guys, not because I'm obligated to.

It's called objective criticism, hopefully you'll go do some research into high resolution timers and end up learning some interesting things instead.

At the moment it's a pissing contest between which side of a an interrupt that the main loop(s) finish on. Hell time.time() isn't even using high-resolution timers so you're only going to get accuracy down to about 0.015625 of a second with the rest of it being an experiment in ways to get pseudo-random floating point noise. Why aren't you at least using time.clock() ?


ynori7's Avatar
Future Emperor of Earth
0 0

lemmingmolester wrote: Why aren't you at least using time.clock() ?

I'm going to use the timeit module which uses clock() in linux. I used time() in the original script that everybody received since the result of clock is OS dependent.


ynori7's Avatar
Future Emperor of Earth
0 0
ynori7 : 1.05984735139
COM : 1.06420759881
Defience2 : 1.17680860859
Defience1 : 1.78079194828
just a panda : 2.13043020555
darkprincefrost : 2.24955297377
false : 2.36445404936
stdio1 : 4.15809439532
stdio2 : 4.35782889438
SwiftNomad : 9.78500797485
c4p_sl0ck : 11.5868839546
p4plus2 : 13.0557179698
lemmingmolester : 13.548157984
stealth- : 15.3230780889
Original : 38.8890140153
richo : 61.62740711```
Those are the times for all entries as of now (in order of speed) averaged over 50 executions on my windows PC. For the final testing I&#39;ll average over 100 executions on Linux.

After seeing your times, if anybody wants to make any last minute changes to their scripts you have until tomorrow. Note that Defience and stdio have two scripts in there, if you want to submit another script to see how it compares to your old one that&#39;s fine too.

ghost's Avatar
0 0

time.clock() is better for Windows, time.time() is better for Linux.

From the timeit module:markupif sys.platform == &quot;win32&quot;: # On Windows, the best timer is time.clock() default_timer = time.clock else: # On most other platforms the best timer is time.time() default_timer = time.time


ghost's Avatar
0 0

It will be interesting to see the other entries :)


ghost's Avatar
0 0

ynori7, thanks for taking the time to host this competition!


ynori7's Avatar
Future Emperor of Earth
0 0

Defience wrote: ynori7, thanks for taking the time to host this competition! No problem. I'm just trying to keep people active. I'll test everybody's scripts and post the results tomorrow (i.e. in about 20 hours).


ghost's Avatar
0 0

This challenge has been fun even though I've only been trying to learn python for a few weeks and I only have college level training in c/c++ and VB.net

The best I've come up with so far is to set t=time again after the clunky for loop. Although, in doing that I've found a cool way to see where the bottle neck of the program was.

But that's not really optimizing, that's hacking in the loosest sense of the word. And I didn't even type it out, I just copied and pasted it so that's really a scriptkiddy move.

Needless to say when I gave this challenge a real effort I ended up with a pickledump in my dict.


ynori7's Avatar
Future Emperor of Earth
0 0

E-5 wrote: The best I've come up with so far is to set t=time again after the clunky for loop. Although, in doing that I've found a cool way to see where the bottle neck of the program was. Well you can go ahead and try. I wont be using that to measure the time in the final testing, so it'll only slow it down further.


ynori7's Avatar
Future Emperor of Earth
0 0

Alright everyone. The competition is now closed. I begin testing shortly and I'll post the results and people's code in a bit.


ynori7's Avatar
Future Emperor of Earth
0 0

Alright everybody, here are the results of the tests:

ynori7 : 2.05816455846 stdio3 : 2.06205400286 otani : 2.06363439859 just a panda : 2.0648091261 COM : 2.13726479348 Defience2 : 2.29994460967 Defience1 : 3.47828702505 darkprincefrost : 4.3148679887 false : 4.535014478 stdio1 : 7.81700462582 SwiftNomad : 19.3935991771 c4p_sl0ck : 23.0868793168 p4plus2 : 26.4258313223 lemmingmolester : 26.9397008787 stealth- : 29.9493937423 Original : 78.7082909104 richo : 129.59382722 The number there is the time it took for your code to execute 100 times measured in seconds. I ended up doing the final tests in Windows because some people's scripts didn't work on my Linux machine (possibly because it had python 2.5). Regardless, the method I used to time them is accurate in Windows as well.

The winners are: 1st Place: stdio 2nd Place: otani 3rd Place: just a panda

I received permission from the following people to post their code: me, richo, com, false, defience, stealth-, swiftnomad, lemmingmolester, stdio, p4plus2. If any of the others want to post their code as well, go ahead.

Here's the script I used to test all the code (minus those who didn't give permission to post theirs). Entries are in alphabetical order.

############ORIGINAL SCRIPT#################
############################################
def original():
    file=open(&#39;input.txt&#39;, &#39;r&#39;)
    st=file.read()

    zeroes=ones=twos=threes=fours=fives=sixes=sevens=eights=nines=0
    repeats=False

    #Determines how many of each number there are in the string
    for x in range(0, len(st)):
            if(st[x]==&#39;0&#39;):
                    zeroes+=1
            if(st[x]==&#39;1&#39;):
                    ones+=1
            if(st[x]==&#39;2&#39;):
                    twos+=1
            if(st[x]==&#39;3&#39;):
                    threes+=1
            if(st[x]==&#39;4&#39;):
                    fours+=1
            if(st[x]==&#39;5&#39;):
                    fives+=1
            if(st[x]==&#39;6&#39;):
                    sixes+=1
            if(st[x]==&#39;7&#39;):
                    sevens+=1
            if(st[x]==&#39;8&#39;):
                    eights+=1
            if(st[x]==&#39;9&#39;):
                    nines+=1

    counts=[zeroes, ones, twos, threes, fours, fives, sixes, sevens, eights, nines]

    #Determines if there are any duplicates in the array of counts
    for x in range(0, len(counts)):
            for y in range(0, len(counts)):
                    if(x==y): continue
                    if(counts[x]==counts[y]): repeats=True

    #Print results
    for x in range(0, len(counts)):
            print &quot;There are &quot;+repr(counts[x])+&quot; &quot;+repr(x)+&quot;&#39;s&quot;
    if(repeats==True):
            print &quot;There was at least one repeat&quot;
    else:
            print &quot;There were no repeats&quot;
############################################
###########OPTIMIZED SCRIPTS################
############################################
def COM():
    file=open(&#39;input.txt&#39;, &#39;r&#39;)
    str=file.read()
    ar=[repr(str.count(&quot;0&quot;)),repr(str.count(&quot;1&quot;)),repr(str.count(&quot;2&quot;)),repr(str.count(&quot;3&quot;)),repr(str.count(&quot;4&quot;)),repr(str.count(&quot;5&quot;)),repr(str.count(&quot;6&quot;)),repr(str.count(&quot;7&quot;)),repr(str.count(&quot;8&quot;)),repr(str.count(&quot;9&quot;))]
    a=&quot;There are &quot;
    c=&quot;&#39;s&#92;n&quot;

    if(len(dict.fromkeys(ar))!=10):
            print a+ar[0]+&quot; 0&quot;+c+a+ar[1]+&quot; 1&quot;+c+a+ar[2]+&quot; 2&quot;+c+a+ar[3]+&quot; 3&quot;+c+a+ar[4]+&quot; 4&quot;+c+a+ar[5]+&quot; 5&quot;+c+a+ar[6]+&quot; 6&quot;+c+a+ar[7]+&quot; 7&quot;+c+a+ar[8]+&quot; 8&quot;+c+a+ar[9]+&quot; 9&quot;+c+&quot;There was at least one repeat&quot;
    else:
            print a+ar[0]+&quot; 0&quot;+c+a+ar[1]+&quot; 1&quot;+c+a+ar[2]+&quot; 2&quot;+c+a+ar[3]+&quot; 3&quot;+c+a+ar[4]+&quot; 4&quot;+c+a+ar[5]+&quot; 5&quot;+c+a+ar[6]+&quot; 6&quot;+c+a+ar[7]+&quot; 7&quot;+c+a+ar[8]+&quot; 8&quot;+c+a+ar[9]+&quot; 9&quot;+c+&quot;There were no repeats&quot;
##############################################
def Defience1():
    f=open(&#39;input.txt&#39;, &#39;r&#39;)
    s=f.read()
    print(&quot;&#92;n&quot;.join((&quot;There are %d: %d&quot; % (s.count(str(i)), i)) for i in range(10)))
    a=(s.count(&#39;0&#39;),s.count(&#39;1&#39;),s.count(&#39;2&#39;),s.count(&#39;3&#39;),s.count(&#39;4&#39;),s.count(&#39;5&#39;),s.count(&#39;6&#39;),s.count(&#39;7&#39;),s.count(&#39;8&#39;),s.count(&#39;9&#39;))
    b=set(a)
    if len(b)!=len(a):
        print &quot;There was at least one repeat&quot;
    else:
       print &quot;There were no repeats&quot;
################################################
def Defience2():
    f=open(&#39;input.txt&#39;, &#39;r&#39;)
    s=f.read()
    counts=[s.count(str(i)) for i in range(10)]
    print(&quot;&#92;n&quot;.join((&quot;There are %d: %d&quot; % ((c, i)) for (i, c) in enumerate(counts))))
    b=set(counts)
    if len(b) != len(counts):
       print &quot;at least one duplicate found&quot;
    else:
       print &quot;None found&quot;
#############################################
def false():
    import sys

    orefs = [sys.getrefcount(x) for x in map(str, range(10))]

    file=open(&#39;input.txt&#39;)
    data=list(file.read())
    file.close()

    urefs = [sys.getrefcount(x) for x in map(str, range(10))]
    drefs = map(lambda x: x[0]-x[1], zip(urefs, orefs))

    i = 0
    for v in drefs:
        print &quot;There are {0} {1}&#39;s&quot;.format(v, i)
        i += 1

    if len({}.fromkeys(drefs)) != len(drefs):
        print &quot;There was at least one repeat&quot;
    else:
        print &quot;There where no repeats&quot;
##############################################
def lemmingmolester():
    cunt = [0,0,0,0,0,0,0,0,0,0]
    str = open(&#39;input.txt&#39;, &#39;r&#39;).read()

    arse = 0
    for char in str:
            arse = ord(char) - 48
            if (arse &gt;= 0) and (arse &lt;= 9):
                    cunt[arse] = cunt[arse] + 1

    for x in range(0, len(cunt)):
            print &quot;There are &quot; + repr(cunt[x]) + &quot; &quot; + repr(x) + &quot;&#39;s&quot;

    if len({}.fromkeys(cunt).keys()) != len(cunt):
            print &quot;There was at least one repeat&quot;
    else:
            print &quot;There were no repeats&quot;
###############################################
def p4plus2():
    file=open(&#39;input.txt&#39;, &#39;r&#39;)
    s=file.read()
    a = [0,0,0,0,0,0,0,0,0,0]
    r = l = len(s)
    for x in range(0, l):
            a[ord(s[x])-48]+=1
    for x in range(0, 10):
            for y in range(0, 10):
                    if(x==y): continue
                    if(a[x]==a[y]): r=1
    print &#39;There are 0 &#39;+repr(a[0])+&quot;&#39;s&quot;
    print &#39;There are 1 &#39;+repr(a[1])+&quot;&#39;s&quot;
    print &#39;There are 2 &#39;+repr(a[2])+&quot;&#39;s&quot;
    print &#39;There are 3 &#39;+repr(a[3])+&quot;&#39;s&quot;
    print &#39;There are 4 &#39;+repr(a[4])+&quot;&#39;s&quot;
    print &#39;There are 5 &#39;+repr(a[5])+&quot;&#39;s&quot;
    print &#39;There are 6 &#39;+repr(a[6])+&quot;&#39;s&quot;
    print &#39;There are 7 &#39;+repr(a[7])+&quot;&#39;s&quot;
    print &#39;There are 8 &#39;+repr(a[8])+&quot;&#39;s&quot;
    print &#39;There are 9 &#39;+repr(a[9])+&quot;&#39;s&quot;
    if(r==1):
            print &#39;There was at least one repeat&#39;
    else:
            print &#39;There were no repeats&#39;
#############################################
def richo():
    file=open(&#39;input.txt&#39;, &#39;r&#39;)

    counts = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    repeats=False
    last = -1

    #Determines how many of each number there are in the string
    for line in file:
        for x in line:
            y = int(x)
            try:
                counts[y] += 1
                if last == y: repeats = True
            except IndexError:# Somethign wasn&#39;t a number
                pass # I don&#39;t care


    #Print results
    for x in range(10):
        print &quot;There are %i %i&#39;s&quot; % (counts[x], x)
    if(repeats==True):
        print &quot;There was at least one repeat&quot;
    else:
        print &quot;There were no repeats&quot;
##################################################
def stdio1():
    import re
    f=open(&#39;input.txt&#39;, &#39;r&#39;)
    a=f.read()
    j=[]
    b=re.compile(str(&#39;0&#39;))
    j.append(len(b.findall(a)))
    b=re.compile(str(&#39;1&#39;))
    j.append(len(b.findall(a)))
    b=re.compile(str(&#39;2&#39;))
    j.append(len(b.findall(a)))
    b=re.compile(str(&#39;3&#39;))
    j.append(len(b.findall(a)))
    b=re.compile(str(&#39;4&#39;))
    j.append(len(b.findall(a)))
    b=re.compile(str(&#39;5&#39;))
    j.append(len(b.findall(a)))
    b=re.compile(str(&#39;6&#39;))
    j.append(len(b.findall(a)))
    b=re.compile(str(&#39;7&#39;))
    j.append(len(b.findall(a)))
    b=re.compile(str(&#39;8&#39;))
    j.append(len(b.findall(a)))
    b=re.compile(str(&#39;9&#39;))
    j.append(len(b.findall(a)))
    if len(j)==len(set(j)):
            print &quot;There are %s 0&#39;s&#92;n&#92;
    There are %d 1&#39;s&#92;n&#92;
    There are %d 2&#39;s&#92;n&#92;
    There are %d 3&#39;s&#92;n&#92;
    There are %d 4&#39;s&#92;n&#92;
    There are %d 5&#39;s&#92;n&#92;
    There are %d 6&#39;s&#92;n&#92;
    There are %d 7&#39;s&#92;n&#92;
    There are %d 8&#39;s&#92;n&#92;
    There are %d 9&#39;s&#92;nThere are no repeats&quot; % &#92;
    (j[0],j[1],j[2],j[3],j[4],j[5],j[6],j[7],j[8],j[9])
    else:
            print &quot;There are %s 0&#39;s&#92;n&#92;
    There are %d 1&#39;s&#92;n&#92;
    There are %d 2&#39;s&#92;n&#92;
    There are %d 3&#39;s&#92;n&#92;
    There are %d 4&#39;s&#92;n&#92;
    There are %d 5&#39;s&#92;n&#92;
    There are %d 6&#39;s&#92;n&#92;
    There are %d 7&#39;s&#92;n&#92;
    There are %d 8&#39;s&#92;n&#92;
    There are %d 9&#39;s&#92;nThere was atleast one repeat&quot; % &#92;
    (j[0],j[1],j[2],j[3],j[4],j[5],j[6],j[7],j[8],j[9])
#################################################
def stdio3():
    f=open(&#39;input.txt&#39;, &#39;r&#39;)
    a=f.read()
    j=a.count(&quot;0&quot;),a.count(&quot;1&quot;),a.count(&quot;2&quot;),a.count(&quot;3&quot;),a.count(&quot;4&quot;),a.count(&quot;5&quot;),a.count(&quot;6&quot;),a.count(&quot;7&quot;),a.count(&quot;8&quot;),a.count(&quot;9&quot;)
    if len(j)==len(set(j)):
            print &quot;There are %s 0&#39;s&#92;n&#92;
    There are %d 1&#39;s&#92;n&#92;
    There are %d 2&#39;s&#92;n&#92;
    There are %d 3&#39;s&#92;n&#92;
    There are %d 4&#39;s&#92;n&#92;
    There are %d 5&#39;s&#92;n&#92;
    There are %d 6&#39;s&#92;n&#92;
    There are %d 7&#39;s&#92;n&#92;
    There are %d 8&#39;s&#92;n&#92;
    There are %d 9&#39;s&#92;nThere are no repeats&quot; % &#92;
    (j[0],j[1],j[2],j[3],j[4],j[5],j[6],j[7],j[8],j[9])
    else:
            print &quot;There are %s 0&#39;s&#92;n&#92;
    There are %d 1&#39;s&#92;n&#92;
    There are %d 2&#39;s&#92;n&#92;
    There are %d 3&#39;s&#92;n&#92;
    There are %d 4&#39;s&#92;n&#92;
    There are %d 5&#39;s&#92;n&#92;
    There are %d 6&#39;s&#92;n&#92;
    There are %d 7&#39;s&#92;n&#92;
    There are %d 8&#39;s&#92;n&#92;
    There are %d 9&#39;s&#92;nThere was at least 1 repeat&quot; % &#92;
    (j[0],j[1],j[2],j[3],j[4],j[5],j[6],j[7],j[8],j[9])	
##############################################
def stealth():
    file=open(&#39;input.txt&#39;)
    zeroes=0
    ones=0
    twos=0
    threes=0
    fours=0
    fives=0
    sixes=0
    sevens=0
    eights=0
    nines=0
    repeats=0
    for c in file.read():
            if c==&#39;0&#39;: zeroes+=1
            elif c==&#39;1&#39;: ones+=1
            elif c==&#39;2&#39;: twos+=1
            elif c==&#39;3&#39;: threes+=1
            elif c==&#39;4&#39;: fours+=1
            elif c==&#39;5&#39;: fives+=1
            elif c==&#39;6&#39;: sixes+=1
            elif c==&#39;7&#39;: sevens+=1
            elif c==&#39;8&#39;: eights+=1
            elif c==&#39;9&#39;: nines+=1
    counts=[zeroes, ones, twos, threes, fours, fives, sixes, sevens, eights, nines]
    a = range(10)
    for x in a:
            for y in a:
                    if(x==y): continue
                    elif(counts[x]==counts[y]): repeats=1; break
    for x in a: print &quot;There are %d %d&#39;s&quot; % (counts[x],x)
    if repeats: print &quot;There was at least one repeat&quot;
    else: print &quot;There were no repeats&quot;
##############################################
def SwiftNomad():
    import re
    numbers = dict()
    numbers = {&#39;&#92;n&#39;: 0, &#39;0&#39;: 0, &#39;1&#39;: 0, &#39;2&#39;: 0, &#39;3&#39;: 0, &#39;4&#39;: 0, &#39;5&#39;: 0, &#39;6&#39;: 0, &#39;7&#39;: 0, &#39;8&#39;: 0, &#39;9&#39;: 0 }
    file = open(&#39;input.txt&#39;, &#39;r&#39;)
    str=file.read()
    str = re.sub(&quot;[^0-9]&quot;, &quot;&quot;, str)
    repeats = False
    total = len(str)
    for x in range(0, total):
     numbers[str[x]] += 1
    for x in numbers:
     for y in numbers:
      if(x == y):
       continue
      elif(numbers[x] == numbers[y]):
       repeats = True
    print &#39;There are %d 0&#92;&#39;s&#39; % (numbers[&#39;0&#39;])
    print &#39;There are %d 1&#92;&#39;s&#39; % (numbers[&#39;1&#39;])
    print &#39;There are %d 2&#92;&#39;s&#39; % (numbers[&#39;2&#39;])
    print &#39;There are %d 3&#92;&#39;s&#39; % (numbers[&#39;3&#39;])
    print &#39;There are %d 4&#92;&#39;s&#39; % (numbers[&#39;4&#39;])
    print &#39;There are %d 5&#92;&#39;s&#39; % (numbers[&#39;5&#39;])
    print &#39;There are %d 6&#92;&#39;s&#39; % (numbers[&#39;6&#39;])
    print &#39;There are %d 7&#92;&#39;s&#39; % (numbers[&#39;7&#39;])
    print &#39;There are %d 8&#92;&#39;s&#39; % (numbers[&#39;8&#39;])
    print &#39;There are %d 9&#92;&#39;s&#39; % (numbers[&#39;9&#39;])
    if(repeats==True):
            print &quot;There was at least one repeat&quot;
    else:
            print &quot;There were no repeats&quot;
#################################################
def ynori7():
    file=open(&#39;input.txt&#39;, &#39;r&#39;)
    st=file.read()

    ar=[]
    ar+=[st.count(&quot;0&quot;)]
    ar+=[st.count(&quot;1&quot;)]
    ar+=[st.count(&quot;2&quot;)]
    ar+=[st.count(&quot;3&quot;)]
    ar+=[st.count(&quot;4&quot;)]
    ar+=[st.count(&quot;5&quot;)]
    ar+=[st.count(&quot;6&quot;)]
    ar+=[st.count(&quot;7&quot;)]
    ar+=[st.count(&quot;8&quot;)]
    ar+=[st.count(&quot;9&quot;)]
    ar=map(str, ar)
    if(len(set(ar))&lt;10):
            out = &quot;There was at least one repeat&quot;
    else:
            out = &quot;There were no repeats&quot;
    out=&quot;&quot;.join(map(lambda b,n:&quot;There are &quot;+b+&quot; &quot;+n+&quot;&#39;s&#92;n&quot;, ar, map(str, range(0, 10))))+out
    print out
#################################################
    
if __name__==&#39;__main__&#39;:
    from timeit import Timer
    times={}
    t = Timer(&quot;original()&quot;, &quot;from __main__ import original&quot;)
    times[&quot;Original&quot;]=t.timeit(100)
    t = Timer(&quot;COM()&quot;, &quot;from __main__ import COM&quot;)
    times[&quot;COM&quot;]=t.timeit(100)
    t = Timer(&quot;ynori7()&quot;, &quot;from __main__ import ynori7&quot;)
    times[&quot;ynori7&quot;]=t.timeit(100)
    t = Timer(&quot;stealth()&quot;, &quot;from __main__ import stealth&quot;)
    times[&quot;stealth-&quot;]=t.timeit(100)
    t = Timer(&quot;stdio1()&quot;, &quot;from __main__ import stdio1&quot;)
    times[&quot;stdio1&quot;]=t.timeit(100)
    t = Timer(&quot;stdio3()&quot;, &quot;from __main__ import stdio3&quot;)
    times[&quot;stdio3&quot;]=t.timeit(100)
    t = Timer(&quot;p4plus2()&quot;, &quot;from __main__ import p4plus2&quot;)
    times[&quot;p4plus2&quot;]=t.timeit(100)
    t = Timer(&quot;Defience1()&quot;, &quot;from __main__ import Defience1&quot;)
    times[&quot;Defience1&quot;]=t.timeit(100)
    t = Timer(&quot;Defience2()&quot;, &quot;from __main__ import Defience2&quot;)
    times[&quot;Defience2&quot;]=t.timeit(100)
    t = Timer(&quot;SwiftNomad()&quot;, &quot;from __main__ import SwiftNomad&quot;)
    times[&quot;SwiftNomad&quot;]=t.timeit(100)
    t = Timer(&quot;false()&quot;, &quot;from __main__ import false&quot;)
    times[&quot;false&quot;]=t.timeit(100)
    t = Timer(&quot;richo()&quot;, &quot;from __main__ import richo&quot;)
    times[&quot;richo&quot;]=t.timeit(100)
    t = Timer(&quot;lemmingmolester()&quot;, &quot;from __main__ import lemmingmolester&quot;)
    times[&quot;lemmingmolester&quot;]=t.timeit(100)
    
    items=times.items()
    items.sort(lambda x,y: cmp(x[1], y[1]))
    for x in items:
        print str(x[0])+&quot; : &quot;+str(x[1])

ghost's Avatar
0 0

Well, I failed horribly. :( Oh well, it was a fun way to learn a little Python. Congrats to the winners!


ghost's Avatar
0 0

Congrats to the winners! lemmingmolester, your choice of words cracks me up :D


ghost's Avatar
0 0

Woot

Anyways thanks for running this. It seems that all of the top speed ones used count() – after trying many other ways, this did seem to be the best.


ynori7's Avatar
Future Emperor of Earth
0 0

stdio wrote: Anyways thanks for running this. It seems that all of the top speed ones used count() – after trying many other ways, this did seem to be the best. Yeah, though I like some of the other solutions too even though they didn't win. I particularly like false's solution because I never would have thought of using a method like that. As for my own solution, I'm surprised it did as well as it did since I was just trying to do something different than everyone else.

Anyway, thanks to everybody who participated. The winners will get their prizes as soon as I'm able to contact Mr Cheese.


korg's Avatar
Admin from hell
0 0

Nice job to all!