Two weeks ago distributed.net announced the beginning of its new project, OGR-27. This project will verify that the current shortest-known Golomb ruler with 27 marks is optimal (i.e. the shortest possible), or it will find the optimal 27-mark ruler. Bovine, a leader of the project, stated that “we are confident that we will discover a better ruler for OGR-27 than the one we know to be optimal currently.” He also stated that the project will take about seven years to complete.

So, how do you get excited about a project that will take so long to complete? Seven years is a long time. You may be in a different decade of your life seven years from now. You may be married (or divorced) seven years from now. You may have children (or your children may have grown up and left home) seven years from now. You may be in a different state (or province or country), a different job, even a different career. You will almost definitely finish the project on a different computer than the one on which you began it. In seven years you may not care about Golomb rulers, let alone finding the optimal 27-mark ruler. How do you commit to a project when its end is so far in the future?

Here are several reasons to get excited:

1. You know when the project will end. Because of the nature of Golomb rulers, there is no way to know exactly how many rulers there are for a given number of marks until you have found all of them. For its first OGR project, OGR-24, distributed.net had no way to estimate how much work needed to be done, so participants could not know how much of the project was complete or when it might finish. For OGR-25 the project owners discovered that while they couldn’t know the total number of rulers they would need to test, they did know the total number of “stubs,” or beginnings of rulers, they would need to test. Work units for each OGR project are individual stubs, such as 27/4-8-35-45-24* (which my computer is working on right now). That let the project owners show a project’s progress as the number of stubs completed compared to the total number of stubs for the project. It’s harder to be excited about a project when you have no idea when it will complete. It’s easier to be excited when you know roughly when it will end, even if that end is several years away.

2. The project may end sooner. In the next four years most OGR-27 participants will have upgraded their computers and will have at least twice as much computing power as they do now. More participants may join the project as they see it move closer to completion. More people may become interested in Golomb rulers within the next seven years and may join the project. Some distributed computing teams may get into a stats competition for the project and may temporarily give it a boost in computing power. Assuming the project leaders did not factor this potential growth in computing power into their time estimate, the project could end in six or even five years.

3. You’re contributing to something new. No one has found the optimal 27-mark Golomb ruler before. You’re helping to make a new discovery, helping to make history. You’re also contributing to something big. Until a few years ago it would not have been practical to attempt a project of this size. Seven years is an easier length of time to commit to than 14, or 20.

4. Larger goals give you a greater sense of accomplishment. A runner can feel a sense of accomplishment from completing a 5K race. But he or she can feel a much more significant sense of accomplishment from completing a marathon. Eleven years ago I set a lifetime goal for myself to walk 25,000 miles, the equivalent of a walk around the world. I created a Walk Around the World website to help me track my progress toward that goal (the site is also helping almost 300 other people set and reach their walking goals). At my starting rate of 600 miles per year, I set myself a goal that I could not complete for at least 40 years. Will I feel a big sense of accomplishment 30 years from now when I reach that goal? You bet I will! And you will feel a greater sense of achievement in participating in a seven-year-long project than you will participating in a seven-month-long project.

5. Your participation matters. You may only be one of several thousand participants in OGR-27. You may only have one CPU to contribute, compared to another participant’s server farm with hundreds of CPUs. You may only participate in the project for one week each year. But each work unit you complete brings the project a few more minutes or hours closer to completion. Each work unit you complete rules out several billion possible rulers. Each work unit you complete has a chance of finding the optimal ruler, of creating brand new knowledge.
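To make the stub from reason 1 a little more concrete, here is a minimal sketch in Python. It rests on two assumptions that the project itself has not confirmed here: that the numbers after the slash are the gaps between consecutive marks, starting from a mark at 0, and that the trailing asterisk can be ignored. The function names are made up for illustration. A ruler is a Golomb ruler when no two pairs of marks are the same distance apart, which is what the second function checks.

```python
from itertools import combinations


def stub_to_marks(stub: str) -> list[int]:
    """Turn a stub such as '27/4-8-35-45-24*' into mark positions.

    Assumption: the numbers after the slash are the gaps between
    consecutive marks, and the first mark sits at position 0.
    The trailing '*' is simply dropped (its meaning is not covered here).
    """
    _, gaps_text = stub.rstrip("*").split("/")
    marks = [0]
    for gap in gaps_text.split("-"):
        marks.append(marks[-1] + int(gap))
    return marks


def is_golomb(marks: list[int]) -> bool:
    """True when every pair of marks is a different distance apart."""
    distances = [b - a for a, b in combinations(marks, 2)]
    return len(distances) == len(set(distances))


marks = stub_to_marks("27/4-8-35-45-24*")
print(marks)             # [0, 4, 12, 47, 92, 116]
print(is_golomb(marks))  # True
```

Under those assumptions the stub fixes only the first six of the 27 marks. A work unit then has to try every way of placing the remaining marks, which is why a single stub can cover billions of candidate rulers.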

It may be hard to get excited about a project which won’t end for seven years, but that shouldn’t discourage you from participating. My computer is currently testing about 55.6 million OGR-27 rulers every second. While I wrote this entry the computer tested over 400 billion rulers. Those rulers are an almost immeasurably small percentage of the total rulers which need to be tested. They only removed two hours of computing time needed to complete the project. But, except for a double-check by another participant, those rulers won’t have to be tested again. It wasn’t exciting. It is satisfying.
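The two-hour figure follows directly from the two numbers quoted above. A quick check in Python, using only those numbers (the variable names are mine):

```python
# Figures quoted in the post.
RULERS_PER_SECOND = 55.6e6   # about 55.6 million rulers per second
RULERS_TESTED = 400e9        # "over 400 billion" rulers

seconds = RULERS_TESTED / RULERS_PER_SECOND
hours = seconds / 3600

print(f"{seconds:,.0f} seconds, or about {hours:.1f} hours")
# 7,194 seconds, or about 2.0 hours
```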


What is distributed computing? And why should I care about it?

Distributed computing is a computing technique which splits a large problem into small pieces and gives the pieces to several computers, allows the computers to solve their pieces at the same time, then combines the results from all of the pieces into a result for the entire problem. See a Wikipedia article for a more detailed description of it.
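The split, solve and combine steps can be sketched in a few lines of Python. This is only a toy illustration, not how any real project works: threads on one machine stand in for separate computers, the "problem" is adding up the squares of the numbers 1 to 1,000, and the function names are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split(numbers, pieces):
    """Cut the problem into roughly equal pieces."""
    size = max(1, -(-len(numbers) // pieces))  # ceiling division, at least 1
    return [numbers[i:i + size] for i in range(0, len(numbers), size)]


def solve_piece(piece):
    """The work one 'computer' does on its own piece."""
    return sum(n * n for n in piece)


def solve_distributed(numbers, workers=4):
    """Split the problem, solve the pieces side by side, combine the results."""
    pieces = split(numbers, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(solve_piece, pieces))
    return sum(partial_results)


numbers = list(range(1, 1001))
print(solve_distributed(numbers))  # 333833500
```

The important property is that no piece depends on any other piece, so the workers never need to talk to each other while they work. Problems with that property are the ones that suit volunteer computing best.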

So what? That sounds pretty boring.

How distributed computing works is mostly interesting to computer scientists and enthusiasts. What it does, or what it can be used for, should be interesting to everyone.

Imagine a task that is too large to be completed by one person within a reasonable amount of time. Perhaps you need to hand-write 1,000 copies of a letter and each copy requires at least one hour to write. If you have to write the letters by yourself, you will need to write for 1,000 hours or almost 42 days to finish the task. Maybe you need to prepare 10,000 care packages to send to victims of a hurricane and each package requires 5 minutes to assemble. If you are preparing those packages by yourself, you will need to work for 833 hours or almost 35 days to complete the task.

Completing these tasks would be so daunting that no one would try to complete them by himself or herself. But if the letter-writer recruited 50 volunteers and each volunteer wrote 20 letters, the task could be completed in 20 hours, less than one day. If the package preparer recruited 100 volunteers and each volunteer prepared 100 packages, the task could be completed in a little more than 8 hours. If the preparer recruited 200 volunteers and each volunteer prepared 50 packages, the task could be completed in a little more than 4 hours.
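The arithmetic behind the letter and care package examples is simple enough to check with a short Python function. It assumes the work is shared evenly and that every volunteer works at the same pace; the function name is just for illustration.

```python
def hours_to_finish(items, minutes_per_item, workers=1):
    """Hours needed when the items are shared evenly among the workers."""
    items_per_worker = items / workers
    return items_per_worker * minutes_per_item / 60


# 1,000 letters at one hour each
print(hours_to_finish(1_000, 60))                  # 1000.0
print(hours_to_finish(1_000, 60, workers=50))      # 20.0

# 10,000 care packages at five minutes each
print(round(hours_to_finish(10_000, 5), 1))               # 833.3
print(round(hours_to_finish(10_000, 5, workers=100), 1))  # 8.3
print(round(hours_to_finish(10_000, 5, workers=200), 1))  # 4.2
```

Doubling the number of volunteers halves the time, which is the whole appeal: the total amount of work never changes, only how many hands share it.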

Now imagine a problem that can be solved with computing techniques but that is still too large to be solved by one computer within a reasonable amount of time.

Maybe you’re a mathematician who wants to discover new Mersenne prime numbers larger than any that are currently known. Or you want to solve the Sierpinski problem, a mathematical problem which was posed in 1967 and which has not yet been solved. Maybe you’re a meteorologist who wants to test a global warming theory by predicting what Earth’s climate will be like 50 years from now. Perhaps you’re a biologist who wants to learn how linear chains, or “strings,” of amino acids fold into three-dimensional proteins (the building blocks of life) by simulating the process in a computer program, or you want to test a large set of molecules to see whether any of them would be a safe and effective drug to fight AIDS or cancer. Maybe you’re an art student who wants to create a film with computer animation, like Pixar’s “Toy Story.”

If you’re the mathematician you may need to test several thousand numbers to see if any of them are primes. Each number may take between several hours and several months to test. If you’re the meteorologist you may need to generate thousands of climate prediction simulations, each of which takes two months or more to complete. If you’re the biologist you will need several months or years to complete a single protein folding simulation or you will need to test millions of potential molecules, with each test requiring several hours or days of computing time. If you’re the art student, you will need to render or “draw” thousands of computer-generated images to assemble into a film. Each image may take many hours to many days to render.

If you only have one computer to do the work for any of these problems, it will take longer than your lifetime to solve that problem. If you only have one supercomputer, it will still take several decades to solve the problem. No one would attempt to solve these problems because solving them would not be practical. But if you have a network of thousands or millions of computers to do the work, and if you can break the work into small enough pieces so that each computer can finish a piece within a few hours or days, you can solve the problem far more quickly: within a few days, months or years.

Thanks to advances in distributed computing research over the past 15 years, all of the computing problems described above can now be researched in a practical way, and all of them are being researched right now. Plus, all of them are public projects in which anyone may participate.

GIMPS, the Great Internet Mersenne Prime Search, has discovered the eight largest known Mersenne prime numbers in the project’s 13 years, including the first known prime with more than ten million digits. Seventeen or Bust has found 11 of the final 17 primes that need to be found to solve the Sierpinski Problem. climateprediction.net has recruited volunteers from all over the world to complete over 382,000 climate simulations (each simulation requires a month or more of computation to complete) from which it can create an average prediction of the most likely climate 50 years from now. Folding@home has recruited volunteers with their PCs, PC graphics cards and PlayStation 3 game consoles to build a computing network with 5 petaFLOPS of computing power (5 times more powerful than the average supercomputer). It has created more accurate protein folding simulations and has begun to learn more about how mis-folds can cause diseases like Alzheimer’s. World Community Grid’s Help Conquer Cancer project has tested over 33 million potential cancer-fighting drug molecules. BURP, the Big and Ugly Rendering Project, is helping amateur movie makers and computer graphics researchers render entire movies or complex graphics sequences in the time that they could personally only render a few images.

What is distributed computing? It is a technique which, using volunteers around the world and their millions of computing devices connected by the Internet, enables types of research which have never before been possible. It is a technique which will permanently affect the way scientific research is done. It is a technique which will allow humanity to make discoveries greater than we can imagine.

Welcome to my blog about distributed computing projects and related topics. I became interested in distributed computing in 1999 when a friend told me about the distributed.net project over lunch one day. My fascination with the subject has grown ever since. In 2000 I created what is now the distributedcomputing.info website, which tracks distributed computing projects in which the public can participate. If you are not familiar with distributed computing, please study that site first. I have created this blog to discuss distributed computing projects, the research they are doing, the things they have achieved (or not), distributed human projects (in which you do the work instead of your computer), and topics related to distributed computing.