Monthly Archives: February 2014

Rushing here and there: planning an itinerary for a large scientific meeting

I’ve just returned from the of the in San Francisco. With around 7,000 scientists, multiple simultaneous sessions of talks and nearly a thousand posters every day, it is a large event, but not as big as many. Even so, working out what talks and posters you might want to see is a difficult task. Of course, you might not wish to prepare a schedule as this is somewhat of a personal thing, but I find it helpful just to know how the days will ebb and flow – if today is busy, will tomorrow be a bit quieter and let me recover? If I bump into someone I want to talk to, what will I miss? What posters might be interesting on the other side of the exhibition hall? I emphasise that attending interesting talks and seeing posters are not the only things one does at these conferences – talking to people is useful and fun too – but having that side of it organised does, I find, take your mind off “the next thing”.

Although this was my fifth , I feel it was the first time I was adequately organised, and believe me, I’ve been trying. Before I describe what I found worked for me this time, I’ll quickly describe the options available in the order you encounter them in the run-up to the meeting itself.

Options

1. A useful tool this. You can search by presenter, keyword etc and then add selected talks and posters to an itinerary that is saved against your user account. I usually search by both scientific labs (i.e. surname of the group leader) and keywords in the title or abstract. It is worth starting several weeks in advance and coming back to it, as you tend to remember topics or groups you’d forgotten the first time. Better still, you can download your saved itinerary into your calendar program of choice. Unfortunately this is poorly implemented. For example, say you’ve picked out 30 posters on one day (very easy to do). The planner creates a separate “appointment” for every single poster, all at the same time in your calendar. Then imagine you have a default alarm setting, as many people do, and finally picture the mayhem when your smart phone / laptop / tablet tries alerting you to 30 simultaneous events. The talks don’t fare much better: even if you only pick out one 15 minute talk the planner puts the whole 2 hour session into your calendar. Not helpful.

2. On the first day of the conference you get a copy of the program as a soft-bound book, but this list is in chronological order so we haven’t got the book in our hands just yet. But you can see the program before you start travelling as an . Unfortunately this is one of those “worst-of-both-worlds” things: it is on the screen of my computer but it wants me to flick the pages? And then it makes a flicking noise to show the pages are being turned over? In other words, the UI slows you down in an effort to make you think it is a book. I would love to know if anyone actually used this in the intended way. Fortunately you could download the whole book as a PDF, although finding the correct button took a little searching.

3. An innovation this year. It let you search for sessions on your device and “check in” in a social way to show what session you were in. This boiled down to me seeing what sessions people I didn’t know were in. Not very helpful. The idea is good, but it wasn’t written from the point of view of someone trying to navigate the myriad of sessions and thousands of posters whilst heavily jetlagged. I didn’t use it much so won’t write any more.

4. Paper version program. The old standby. Even without the abstracts it weighs in at a hefty 298 pages – some laptops are lighter. The traditional approach to meeting planning is to leaf through the program at breakfast in your hotel, drawing circles around the talks you want to go to. I wouldn’t try to go through the posters this way though. Solid and dependable, but you don’t get it until you register at the convention centre, so you have to be pretty speedy if you want to develop a schedule more than a day in advance.

5. Going with the flow. Perhaps the easiest method: just follow other people who have similar interests to you, or find the room where the clapping seems loudest. Not very reliable but requires little in the way of preparation. Maybe in the coming years social media, like Twitter, will allow you to gauge where to go, in real-time, as the conference progresses. Bizarrely (in my mind) most scientists are extremely conservative when it comes to social media and so, despite having a hashtag (#bps14) and , there were only 518 tweets making use of the hashtag (and exactly half of these came from three accounts). On average this is a single tweet for every 13 scientists in attendance over five days of the conference. For social media such as Twitter to provide an evolving picture of the conference, for example to show which talks appear especially interesting, you need a rapidly updating timeline, say a tweet a minute, which works out at 600 tweets a day, or 3,000 tweets over the course of the conference. So I’d say we are still some way from any kind of social media “tipping point” that would let you go with the flow.
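For anyone who wants to check the tweet arithmetic, here is a minimal Python sketch. The attendance, tweet count and number of days are the figures quoted above; the 10-hour conference day is my assumption, chosen because it is what turns "a tweet a minute" into 600 tweets a day.

```python
# Figures quoted in the text above.
ATTENDEES = 7000
HASHTAG_TWEETS = 518
CONFERENCE_DAYS = 5

# Assumption: "a tweet a minute" is sustained over a 10-hour conference day,
# which is what gives the 600 tweets a day mentioned above.
ACTIVE_HOURS_PER_DAY = 10


def scientists_per_tweet(attendees: int, tweets: int) -> float:
    """How many attendees there were for each tweet sent."""
    return attendees / tweets


def tweets_needed(tweets_per_minute: float, hours_per_day: float, days: int) -> int:
    """Total tweets needed to sustain a given rate across the whole meeting."""
    return round(tweets_per_minute * 60 * hours_per_day * days)


if __name__ == "__main__":
    ratio = scientists_per_tweet(ATTENDEES, HASHTAG_TWEETS)
    target = tweets_needed(1, ACTIVE_HOURS_PER_DAY, CONFERENCE_DAYS)
    print(f"One tweet for every {ratio:.1f} scientists")
    print(f"Target: {target} tweets, actual: {HASHTAG_TWEETS} "
          f"({HASHTAG_TWEETS / target:.0%} of target)")
```

On these numbers the hashtag reached a bit under a fifth of the volume needed for a tweet-a-minute timeline.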

My recipe

Again, I stress this is what worked for me this year so probably won’t work for you, and maybe won’t even work for me another year, but hopefully will give you some ideas or get you thinking about how to organise an itinerary.

(a). Search the online itinerary planner in advance. I make a list on a piece of paper of keywords and groups that I am interested in. Search on each term and add appropriate talks and posters to your itinerary – be selective and don’t just add all the posters from one group that you admire. Put a line through each term when you’ve searched for it! Come back to your list several times as you’ll find you come up with new groups or keywords to search on. Don’t add it to your calendar! Instead, save the list, perhaps as a PDF.

(b). Search through the program. Download the program book as a PDF, don’t use the online version. Now go through the talks day-by-day and highlight any that look interesting (this is especially easy on an iPad using an App such as ). Don’t worry if you highlight a few that are in your itinerary.

(c). Add talks to your calendar manually (see right). This takes a bit of time, and is a bit dull (and so is perfect for doing on the aeroplane). Go through the program and add talks to your calendar (iCal, Google Calendar, Outlook etc), remembering to include the room number. You can copy the text off the program PDF and paste it into your calendar to cut down on typing. Setting a default alarm can be helpful here as it can easily take 5 minutes to get from one room to another. Now go through the saved version of your online itinerary and add in any talks you’ve missed. It is also helpful to make a list of the posters you’ve picked out for each day at the same time so you can walk from one to the other in the Exhibition Hall.
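If typing each talk into your calendar by hand sounds too dull, this step could in principle be scripted. What follows is only a sketch, not something I used at the meeting: it writes one iCalendar event per talk (not per session), including the room and a 5 minute reminder. The talk titles, rooms, dates and the `example.invalid` identifiers are all made-up placeholders.

```python
import uuid
from datetime import datetime, timedelta, timezone

FMT = "%Y%m%dT%H%M%S"


def escape(text):
    """Escape characters that are special in iCalendar text values."""
    for char in ("\\", ";", ","):
        text = text.replace(char, "\\" + char)
    return text.replace("\n", "\\n")


def make_event(title, room, start, minutes, alarm_minutes=5):
    """Return the iCalendar lines for a single talk with a reminder."""
    end = start + timedelta(minutes=minutes)
    return [
        "BEGIN:VEVENT",
        f"UID:{uuid.uuid4()}@example.invalid",
        f"DTSTAMP:{datetime.now(timezone.utc).strftime(FMT)}Z",
        # No time zone given: a "floating" time, shown as local wall-clock time.
        f"DTSTART:{start.strftime(FMT)}",
        f"DTEND:{end.strftime(FMT)}",
        f"SUMMARY:{escape(title)}",
        f"LOCATION:{escape(room)}",
        "BEGIN:VALARM",
        "ACTION:DISPLAY",
        f"DESCRIPTION:{escape(title)}",
        f"TRIGGER:-PT{alarm_minutes}M",  # reminder before the talk starts
        "END:VALARM",
        "END:VEVENT",
    ]


def make_calendar(talks):
    """Build a calendar from (title, room, start, minutes) tuples.

    Long lines are not folded, so keep titles short.
    """
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0", "PRODID:-//example//talk-planner//EN"]
    for talk in talks:
        lines += make_event(*talk)
    lines.append("END:VCALENDAR")
    return "\r\n".join(lines) + "\r\n"


# Placeholder talks, purely for illustration.
TALKS = [
    ("Hypothetical talk on lipid bilayers", "Room 101", datetime(2014, 2, 17, 9, 15), 15),
    ("Another made-up talk; with, punctuation", "Room 202", datetime(2014, 2, 17, 10, 30), 15),
]

if __name__ == "__main__":
    print(make_calendar(TALKS), end="")
```

Redirect the output to a file ending in `.ics` and most calendar programs should be able to import it; how alarms are handled on import varies between programs, so check one event before importing the lot.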

(d). Look for patterns. Chances are you’ll have quite a few clashes. You’ll also find that you’ve picked out say 3 talks from a session of 8 talks in total. If so, it might be worth getting a coffee and staying for the whole session. Or you might choose to duck out and see another talk in another session and then come back. I prefer to wait until the day before deciding exactly which sessions to hit and how to deal with my clashes as what I decide will depend on where other people are going and, crucially, just how far it is between Room X and Room Y which you won’t find out until you are in the Convention Centre itself.
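Spotting clashes and "stay for the whole session" candidates is easy to do by eye, but it can also be sketched in a few lines of Python. This assumes you have your picks as (title, start, end, session) tuples; the talks and session names below are invented, and the threshold of 3 picks per session is just the example figure from the paragraph above.

```python
from collections import Counter
from datetime import datetime
from itertools import combinations


def find_clashes(talks):
    """Return pairs of talk titles whose time slots overlap."""
    clashes = []
    ordered = sorted(talks, key=lambda talk: talk[1])
    for first, second in combinations(ordered, 2):
        # Two slots overlap if each one starts before the other ends.
        if first[1] < second[2] and second[1] < first[2]:
            clashes.append((first[0], second[0]))
    return clashes


def sessions_worth_staying(talks, threshold=3):
    """Return sessions in which you picked at least `threshold` talks."""
    counts = Counter(talk[3] for talk in talks)
    return sorted(name for name, picked in counts.items() if picked >= threshold)


def at(hour, minute):
    """Helper for the made-up example day below."""
    return datetime(2014, 2, 17, hour, minute)


# Invented picks, purely for illustration.
PICKS = [
    ("Talk A", at(9, 0), at(9, 15), "Session 1"),
    ("Talk B", at(9, 15), at(9, 30), "Session 1"),
    ("Talk C", at(9, 10), at(9, 25), "Session 2"),
    ("Talk D", at(9, 45), at(10, 0), "Session 1"),
]

if __name__ == "__main__":
    for first, second in find_clashes(PICKS):
        print(f"Clash: {first} overlaps {second}")
    for session in sessions_worth_staying(PICKS):
        print(f"Consider staying for all of {session}")
```

Back-to-back talks (one ending exactly when the next starts) are not counted as a clash here, although in practice the walk between rooms may make them one.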

(e). Go with the flow in an organised way. Having put in this work to organise my schedule and put it in my calendar on my iPad, I was able to go from talk to talk without feeling even a tiny bit panicked that I was missing a crucial session. Having a list of poster numbers for each day was hugely helpful too, as a cluster of posters together would indicate that that section would be interesting. For me the crucial point is that all this organisation freed me up during the conference. For example, I bumped into several people as I was leaving a session, but because I had a good idea of what I had next and how interesting it was, I was able to decide whether to talk to them, perhaps over a coffee, or whether I really should hit that talk. It also meant I recognised that Monday, for example, was going to be a really tough day but Tuesday would be easier and then Wednesday, whilst shorter, was also going to be busy. In the end I left the big book in my hotel and just carried around my iPad, which was much easier. I guess the final thing I found helpful is: if you think of something or meet someone, write it down immediately. You are busy, tired and probably also jetlagged – chances are you will not remember that idea, person or reference in a day’s time, let alone when back in the lab. Personally, I use for everything as my notes are synced across all my devices so if, for example, the battery on my iPad runs out, I just swap to the laptop or whatever I have with me.

That was my recipe this year at the . It only took me five goes to get there (and some improvements in technology since I first went) but I feel I learnt a lot more, met a lot more people and had a lot more ideas than the previous four times I have attended. Take my recipe with a pinch of salt as it might not work for you. I am sure though that attending these large meetings can be made to be more manageable; you just have to find what works for you.

GROMACS 4.6: Running on GPUs

I mentioned before that I would write something on running on GPUs. Let’s imagine we want to simulate a solvated lipid bilayer containing 6,000 lipids for 5 µs. The total number of is around 137,000 and the box dimensions are roughly 42 x 42 x 11 nm. Although this is smaller than the benchmark we looked at last time, it is still a challenge to run on a workstation. To see this, let’s consider running it on my MacPro using GROMACS 4.6.1. The machine is an and has 2 Intel Xeons, each with 4 cores. Using 8 MPI processes gets me 132 ns/day, so I would have to wait 38 days for 5 µs. Too slow!
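The "38 days" is just the target simulation length divided by the throughput. As a minimal sketch (the function name is mine, not anything from GROMACS):

```python
def days_to_simulate(target_ns: float, ns_per_day: float) -> float:
    """Wall-clock days needed to reach a target simulation length."""
    if ns_per_day <= 0:
        raise ValueError("throughput must be positive")
    return target_ns / ns_per_day


if __name__ == "__main__":
    target_ns = 5 * 1000          # 5 µs expressed in ns
    workstation_ns_per_day = 132  # 8 MPI processes on the MacPro, from the text
    days = days_to_simulate(target_ns, workstation_ns_per_day)
    print(f"{days:.0f} days")
```

The same function can be reused for any of the other throughputs quoted in this post.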

 

You have to be careful installing non-Apple supported NVIDIA GPUs into MacPros, not least because you are limited to two 6-pin power connectors. Taking all this into account, the best I can do without doing something drastic to the power supply is to install an . Since I only have one GPU, I can only run one MPI process, but this can spawn multiple OpenMP threads. For this particular system, I get the best performance with 4 threads (134 ns/day), which is essentially the same performance I get using all 8 cores without the GPU (132 ns/day). When I am using just a single core, adding in the GPU increases the performance by a factor of 3.3. But as I add additional cores, the increase afforded by the single GPU drops until the performance is about the same at 8 cores.

Now let’s try something bigger. Our lab has a small Intel (Sandy Bridge) computing cluster. Each node has 12 cores, and there are 8 nodes, yielding a maximum of 96 cores. Running on the whole cluster would cut the time to 6 days, which is a lot better but not very fair on everyone else in the lab. We could try and get access to Tier-1 or Tier-0 supercomputers but, for this system, that is overkill. Instead let’s look at a Tier-2 machine that uses GPUs to accelerate the calculations.

 


The University of Oxford, through , has access to the machines owned by the Centre for Innovation. One of these, , is a GPU-based cluster. We shall look at one of the partitions; this has 60 nodes, each with two 6-core Intel processors and 3 NVIDIA M2090 Tesla GPUs. For comparison, let’s first run without the GPUs. The data shown are for simulations with only 1 OpenMP thread per MPI process. So now let’s run using the GPUs (which is the point of this cluster!). Again using just a single OpenMP thread per MPI process/GPU (shown on graph), we find a performance increase of 3-4x. Since there are 3 GPUs per node, and each node has 12 cores, we could run 3 MPI processes (each attached to a GPU) on each node and each process could spawn 1, 2, 3 or 4 OpenMP threads. This uses more cores, but since they would probably otherwise be sitting idle, this is a more efficient use of the compute resource. Trying 2 or 3 OpenMP threads per MPI process/GPU lets us reach a maximum performance of 1.77 µs per day, so we would get our 5 µs in less than 3 days. Comparing back to our cluster, we can get the same performance as our 96-core local cluster using a total of 9 GPUs and 18 cores on EMERALD.
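To keep track of how many GPUs and cores a given run occupies, the layout described above (one MPI process per GPU, each spawning a few OpenMP threads) can be written down as a small Python sketch. The node shape (12 cores, 3 GPUs) is taken from the text; the function and its names are illustrative and not part of GROMACS or the cluster's scheduler.

```python
def node_layout(nodes, threads_per_rank, cores_per_node=12, gpus_per_node=3):
    """Resources used when running one MPI rank per GPU on each node."""
    ranks_per_node = gpus_per_node  # one MPI process attached to each GPU
    cores_used_per_node = ranks_per_node * threads_per_rank
    if cores_used_per_node > cores_per_node:
        raise ValueError(
            f"{threads_per_rank} threads per rank needs {cores_used_per_node} "
            f"cores but a node only has {cores_per_node}"
        )
    return {
        "mpi_ranks": nodes * ranks_per_node,
        "gpus": nodes * gpus_per_node,
        "cores": nodes * cores_used_per_node,
        "idle_cores_per_node": cores_per_node - cores_used_per_node,
    }


if __name__ == "__main__":
    # 3 nodes with 2 OpenMP threads per rank: the 9 GPU / 18 core case above.
    print(node_layout(nodes=3, threads_per_rank=2))
    # With 1 to 4 threads per rank the node is never oversubscribed.
    for threads in (1, 2, 3, 4):
        layout = node_layout(nodes=1, threads_per_rank=threads)
        print(threads, "threads per rank leaves", layout["idle_cores_per_node"], "cores idle")
```

With 4 threads per rank every core on the node is busy, which is why 4 is the upper limit mentioned above.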

Finally, let’s compare EMERALD to the Tier-1 PRACE supercomputer CURIE. CURIE was the . For this comparison we will need to use a bigger benchmark, so let’s use . It has 9x the number of lipids, but because I had to add extra water it ends up being about 15x bigger at 2.1 million particles. Using 24 GPUs and 72 cores, EMERALD manages 130 ns/day. To get the same performance on CURIE requires 150 cores and ultimately CURIE tops out at 1,500 ns/day on 4,196 cores. Still, EMERALD is respectable and shows how it can serve as a useful bridge to Tier-1 and Tier-0 supercomputers. Interestingly, CURIE also has a “hybrid” partition that contains 144 nodes, each with 2 Intel Westmere processors and 2 NVIDIA M2090 Tesla GPUs. I was able to run on up to 128 GPUs with 4 OpenMP threads per MPI process/GPU, making a total of 512 cores. This demonstrates that GROMACS can run on large numbers of GPUs and CPUs and that such hybrid architectures are viable as supercomputers (for GROMACS at least).
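One way to put the CURIE numbers in perspective is to ask how much of the extra hardware turns into extra speed. The sketch below computes a relative parallel efficiency between the two CURIE data points quoted above (150 cores and 4,196 cores); choosing the 150-core run as the baseline is my assumption, and the efficiency figure is derived here, not something I measured directly.

```python
def relative_efficiency(base_cores, base_rate, cores, rate):
    """Speed-up achieved divided by the increase in core count."""
    speedup = rate / base_rate
    resource_ratio = cores / base_cores
    return speedup / resource_ratio


if __name__ == "__main__":
    # CURIE data points quoted in the text (rates in ns/day).
    efficiency = relative_efficiency(base_cores=150, base_rate=130,
                                     cores=4196, rate=1500)
    print(f"Relative parallel efficiency: {efficiency:.0%}")

    # Hybrid partition: one MPI process per GPU, 4 OpenMP threads each.
    gpus, threads_per_rank = 128, 4
    print(f"Cores used on the hybrid partition: {gpus * threads_per_rank}")
```

On these figures, going from 150 to 4,196 cores buys roughly an 11.5x speed-up for about 28x the cores, i.e. an efficiency of around 41% relative to the 150-core run.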