benher
BAM!ID: 7921
Joined: 2006-10-06
Posts: 3
Credits: 465,152
World-rank: 318,412

2006-10-06 22:12:47
last modified: 2006-10-06 22:13:03

Parse all the host CPU names into a logical, short format.

Allow sorting by some of the fields from the new short format.

You could put checkboxes or radio buttons to allow consolidating the list by levels of precision.

Example: [x] Vendor [x] Brand [x] Generation [x] Model [x] Speed

So a visitor could see Intels vs AMDs
Or Intel P4s vs AMD K8s.
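One way the checkbox idea might work is sketched below in Python. The field names follow the checkbox example above; the sample rows and the `consolidate` helper are made up for illustration and are not taken from any existing site code.

```python
from collections import Counter

# Field order follows the checkbox example: Vendor, Brand, Generation, Model, Speed.
FIELDS = ("vendor", "brand", "generation", "model", "speed")

def consolidate(hosts, enabled):
    """Count hosts per group, keeping only the fields whose box is ticked."""
    keep = [i for i, name in enumerate(FIELDS) if name in enabled]
    return Counter(tuple(host[i] for i in keep) for host in hosts)

# Made-up sample rows, for illustration only.
hosts = [
    ("Intel", "Pentium", "P4", "Pentium-IV", "2000MHz"),
    ("Intel", "Celeron", "P4", "Celeron", "2400MHz"),
    ("AMD", "Athlon", "K8", "Athlon 64", "3200+"),
]

by_vendor = consolidate(hosts, {"vendor"})                    # Intels vs AMDs
by_generation = consolidate(hosts, {"vendor", "generation"})  # Intel P4s vs AMD K8s
```

Ticking more boxes simply makes the grouping key longer, so the same list can be shown at any level of precision.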

Could also add boxes to filter by model name or MHz speed, to compare Intel xxx speed vs AMD yyy speed.

The host ID # and parsed string can be stored in your local DB. You would not need to re-parse a host name unless some trigger identified that the original host name had changed (not often).
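A minimal sketch of that caching idea, in Python. The dict stands in for the local DB table, and `get_short_name` and `fake_parse` are hypothetical names used only for this example.

```python
def get_short_name(cache, host_id, raw_name, parse):
    """Return the stored short name, re-parsing only if the raw name changed."""
    entry = cache.get(host_id)
    if entry is None or entry[0] != raw_name:
        entry = (raw_name, parse(raw_name))
        cache[host_id] = entry
    return entry[1]

calls = []

def fake_parse(raw):
    """Stand-in parser that records each call so re-parsing is visible."""
    calls.append(raw)
    return raw.upper()

cache = {}  # stands in for the local DB table keyed by host ID
get_short_name(cache, 42, "cpu a", fake_parse)
get_short_name(cache, 42, "cpu a", fake_parse)  # unchanged, served from the cache
get_short_name(cache, 42, "cpu b", fake_parse)  # name changed, parsed again
```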

For example, there are currently 6 different entries (see end of post) for Athlon 64 X2 3800+ CPUs.

I propose consolidating them so that there is only one entry for all "Athlon 64 X2" type processors, with subcategories based on the xx00+ speed.

Other examples...
AMD Athlon(tm) 64 X2 Dual Core Processor 3800+
becomes AMD, K9, Athlon 64 X2, 3800+

Intel(R) Pentium(R) 4 CPU 2.00GHz
becomes Intel, P4, Pentium-IV, 2000MHz

GenuineIntel x86 Family 6 Model 8 Stepping 6 943MHz
becomes Intel, P3, Pentium-III, 943MHz
(You could even add the core name for this one: Coppermine.)
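To make the three conversions above concrete, here is a rough Python sketch. It only covers those three patterns; the `shorten` function and its regular expressions are assumptions written for this example, not the actual parser, and a real one would need a much larger rule table.

```python
import re

def shorten(raw):
    """Map a raw host CPU string to 'Vendor, Generation, Model, Speed', or None."""
    # AMD model-rated parts, e.g. "... 64 X2 Dual Core Processor 3800+".
    # Assumption: a rating written without the trailing "+" means the same part.
    m = re.search(r"Athlon\(tm\)\s*64 X2.*?(\d{4})\+?\s*$", raw)
    if m:
        return f"AMD, K9, Athlon 64 X2, {m.group(1)}+"
    # Pentium 4 strings give the speed in GHz, e.g. "2.00GHz".
    m = re.search(r"Pentium\(R\) 4 CPU (\d+\.\d+)GHz", raw)
    if m:
        return f"Intel, P4, Pentium-IV, {round(float(m.group(1)) * 1000)}MHz"
    # CPUID-style strings: only Family 6 Model 8 (the example above) is mapped.
    m = re.search(r"GenuineIntel x86 Family 6 Model 8 Stepping \d+ (\d+)MHz", raw)
    if m:
        return f"Intel, P3, Pentium-III, {m.group(1)}MHz"
    return None  # unknown pattern, leave the host name as it is

amd = shorten("AMD Athlon(tm) 64 X2 Dual Core Processor 3800+")
p4 = shorten("Intel(R) Pentium(R) 4 CPU 2.00GHz")
p3 = shorten("GenuineIntel x86 Family 6 Model 8 Stepping 6 943MHz")
```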

6 different listings for the same CPU:
AMD Athlon(tm) 64 X2 Dual Core Processor 3800+  5,820  80,136,813.01  306,865.53  13,769.21  52.73
AMD Athlon(tm) 64 X2 Dual Core 3800+               29     495,000.35    2,146.12  17,068.98  74.00
AMD Athlon(tm) 64 X2 Processor 3800+               21     223,642.14    1,067.24  10,649.63  50.82
AMD Athlon(tm)64 X2 Dual Core Processor 3800+     118   1,600,567.40    7,263.83  13,564.13  61.56
AMD Athlon(tm) 64 X2 Dual Core Processor 3800      19     103,219.56    1,042.39   5,432.61  54.86
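As a sketch of how the listings above could collapse into one entry, the Python below reduces each name to a single key and adds up the first numeric column, which is assumed here to be a host count. The `canonical` helper is hypothetical, and treating a bare 3800 as 3800+ is also an assumption.

```python
import re
from collections import defaultdict

def canonical(raw):
    """Reduce spelling variants of one CPU name to a single key."""
    name = raw.replace("(tm)", " ")
    name = re.sub(r"\b(Dual Core|Processor)\b", " ", name)
    # Assumption: a rating without the trailing "+" is the same part.
    name = re.sub(r"(\d{4})\+?$", r"\1+", name.strip())
    return " ".join(name.split())

# (raw name, first numeric column) copied from the listing above.
rows = [
    ("AMD Athlon(tm) 64 X2 Dual Core Processor 3800+", 5820),
    ("AMD Athlon(tm) 64 X2 Dual Core 3800+", 29),
    ("AMD Athlon(tm) 64 X2 Processor 3800+", 21),
    ("AMD Athlon(tm)64 X2 Dual Core Processor 3800+", 118),
    ("AMD Athlon(tm) 64 X2 Dual Core Processor 3800", 19),
]

merged = defaultdict(int)
for raw, hosts in rows:
    merged[canonical(raw)] += hosts
```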
Honza
BAM!ID: 109
Joined: 2006-05-10
Posts: 154
Credits: 8,928,643,847
World-rank: 435

2006-10-06 22:30:45

I'm also in favor of showing only one unique host CPU name per processor, as you have shown in your example.

Those values are stored in each project's DB.
I believe Willy can write a script that will merge the names, but the source data lives in each project, so that would not be a remedy, only a cosmetic adjustment.

It would be nice to know the origin of this behaviour.
Does each platform give a slightly different name? I believe not.
Different BOINC versions over time? Likely.

Perhaps a better place to address the issue is the dev list...

Btw, making an extra table with those names and relating it to the master DB should save a considerable amount of DB size, I guess.
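The extra-table idea could look roughly like this, using Python's built-in sqlite3 module with an in-memory database. The table and column names (`cpu_name`, `host`, `cpu_id`) are invented for the sketch and do not reflect the real schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE cpu_name (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
    CREATE TABLE host (id INTEGER PRIMARY KEY,
                       cpu_id INTEGER NOT NULL REFERENCES cpu_name(id));
""")

def add_host(host_id, cpu_name):
    """Store the CPU string once and point the host row at it."""
    db.execute("INSERT OR IGNORE INTO cpu_name (name) VALUES (?)", (cpu_name,))
    (cpu_id,) = db.execute(
        "SELECT id FROM cpu_name WHERE name = ?", (cpu_name,)
    ).fetchone()
    db.execute("INSERT INTO host (id, cpu_id) VALUES (?, ?)", (host_id, cpu_id))

add_host(1, "Intel(R) Pentium(R) 4 CPU 2.00GHz")
add_host(2, "Intel(R) Pentium(R) 4 CPU 2.00GHz")
add_host(3, "GenuineIntel x86 Family 6 Model 8 Stepping 6 943MHz")

name_rows = db.execute("SELECT COUNT(*) FROM cpu_name").fetchone()[0]
host_rows = db.execute("SELECT COUNT(*) FROM host").fetchone()[0]
```

Each host row then carries a small integer instead of a repeated string, which is where the size saving would come from.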
benher

2006-10-26 22:25:49
last modified: 2006-10-26 22:44:23

Willy,

If it would be helpful, I've already written C++ source that parses and converts one string format to the other. It was tested on a download of the hosts.gz file from the SETI stats page.

Initially, the whole 1 GB of stats would have to be processed once.

After that, updates to your DB from the daily stats would only have to be made for systems whose last contact date is later than the last time you updated your DB, i.e. what would normally be called "active hosts". There are a LOT of non-active hosts in the current "project by CPU" lists.
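The daily filter could be as simple as the Python sketch below. The `last_contact` key and the sample dates are placeholders chosen for the example; the real stats file may name and encode this field differently.

```python
from datetime import datetime

def hosts_to_update(hosts, last_db_update):
    """Keep only hosts that contacted the project after the last DB update."""
    return [h for h in hosts if h["last_contact"] > last_db_update]

last_db_update = datetime(2006, 10, 25)
hosts = [
    {"id": 1, "last_contact": datetime(2006, 10, 26)},  # active since the last run
    {"id": 2, "last_contact": datetime(2006, 3, 1)},    # inactive, skipped
]
active = hosts_to_update(hosts, last_db_update)
```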

I chose the format [vendor] [generation] [model] because it sorts well, and it places P3-core Celerons among the P3s and P4 Celerons with the other P4s.

The groupings of MHz speeds would, of course, only use the "listed" MHz speed of the CPU. People who overclock would still have their CPUs listed in the original "manufactured speed" group. This would also do away with CPUs that differ by 2 or 3 MHz being sorted into different entries (when the viewer chooses to group by, say, 10s of MHz).
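Grouping by 10s of MHz might be done by snapping each value to the nearest multiple of the chosen step, as in this small Python sketch. The `speed_group` helper is hypothetical, and rounding halves upward is just one possible choice.

```python
def speed_group(mhz, step=10):
    """Snap a MHz value to the nearest multiple of `step` (halves round up)."""
    return (mhz + step // 2) // step * step

# Readings a few MHz apart land in the same group.
groups = {speed_group(m) for m in (1998, 2000, 2002)}
```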

Also, I have submitted CPUID code to Eric Korpela. It checks the CPU directly (low-level hardware, core name, actual MHz speed testing, not string parsing as I'm describing here) and outputs a format similar to what I've described. He has expressed interest in the past in incorporating this into BOINC, so a later release of BOINC may well output this type of format.
Honza

2006-10-27 07:11:40

That's great news - I hope it will be used soon.

I guess the final host.gz file got a bit smaller, right?
benher

2006-10-27 15:28:41

Quote (Honza): "That's great news - I hope it will be used soon. I guess the final host.gz file got a bit smaller, right?"


The current host.gz file is about 147 MB and contains a 1 GB XML file.

The newly proposed strings would be slightly shorter than the existing ones (in some cases), but the majority of the file contents are XML tags, so the file size wouldn't get smaller.

Database storage by boinc-stats might get smaller using indexes and such.
Honza

2006-10-27 15:46:59
last modified: 2006-10-27 15:47:38

Quote (benher): "Database storage by boinc-stats might get smaller using indexes and such."

Yes, this is what I meant.

Index :: Comments and suggestions :: Consolidating all of the CPU and project by CPU statistics.