Week 5 lockdown: data science upskilling

A quick search of my Twitter timeline reveals that at least once a year I declare that I intend to teach myself Python (spoiler alert – I still haven’t!). I learnt to code in MATLAB and picked up bash scripting and R along the way. As my work becomes more data science based and the data I handle becomes larger and larger, I feel that Python might better suit my needs. It is also the most widely used language among data scientists in industry, so it opens up more job prospects beyond academia.

One of the challenges of learning a new programming language is that something you could do in a few minutes can suddenly take a day when you are unfamiliar with the syntax and idiosyncrasies of the new language. However, there is a thriving online community of coders offering courses and getting together to learn, and that helps. @russpoldrack has collated a useful list of courses to get started. Many people in lockdown are using the extra time they have from not commuting, running experiments or sitting in meetings to gain new data science skills.

As I start two new projects, I have decided to commit to using Python to do all data cleaning and analysis. It will slow me down now, but it’s a worthwhile investment in my skills and my career.

Week 4 lockdown: writing productivity for data scientists

Many of us are drawn to data science because we love investigating: playing with data, and finding undiscovered patterns that tell a story. It’s easy to get caught up in the data and put off communicating that story to our peers and the public. This week again, I have found writing to be a useful lockdown activity.

'You can always edit a bad page. You can’t edit a blank page'. I love this quote by the writer Jodi Picoult; in fact it sums up my writing strategy too.

A few years ago I discovered the 'write first, edit later' method for writing productivity, and it has changed the way I approach writing. I no longer agonise over the opening sentence. I just put words down on the page. Any words. Five hundred of them a day to be precise. I set this target because it is a specific goal and it forces me to get something down on the page. Sometimes I write 1000 words, sometimes I fill up 500 words with ‘easier’ sections like the methods or results (never references though, references don’t count!). A blank page is intimidating. It’s hard to imagine going from nothing to the finished product. It’s amazing how quickly 500 words a day becomes something that looks like a journal article, and that is really motivating. I have maintained a consistent writing schedule throughout the lockdown and should have two papers completed by the end of the month.

I have let go of the illusion that I need to be in the right mood, the right environment or have exactly the right words. One advantage of sharing my time between childcare and work is that in the rare moments when I have uninterrupted time to work, I don’t procrastinate. No more excuses, just words.

Dr Michele Veldsman offers data scientists writing productivity advice

Week 3 lockdown: synchronised editing software

As is the case for so many researchers, my research studies demand close collaboration. But I am increasingly frustrated with collaborative writing that involves sending a Microsoft Word document back and forth with ever-increasing nonsensical filenames. Synchronised editing seems to be a useful lockdown activity, so, since I am writing two manuscripts in parallel, for this diary entry I have decided to pit two online collaborative editors against each other: Authorea and Overleaf.

Getting to know synchronised editing software

Both are LaTeX editors, but they differ substantially in technical difficulty: Authorea needs no real knowledge of LaTeX syntax, whereas Overleaf needs familiarity with LaTeX for all but the most basic functions. The advantages of both are the dynamic real-time editing and collaborative tools, easy embedding of inline formulas and images, version control, and automatic renumbering of sections (no more struggling to figure out how to get an image to sit in the right place or worrying about section 1.2.2.2 coming after 1.2.2.1!).

My preference so far has been Authorea, which has excellent in-text citation tools that do not require you to have a synced reference manager or BibTeX attached. The ease of use of Authorea works well for collaborators who do not have LaTeX experience and the ability to submit directly to preprint archives and journals is a huge bonus. I have found the occasional instability of the internet during the lockdown to be a bit concerning when writing on a live manuscript editor as you rely on the internet connection to sync your work into their cloud-based systems. It’s strange not to have a local copy, but the flip side is that your work is available everywhere and saved frequently with full version history.

Of course, collaborative writing is only as useful as the words you manage to get on the page; I’ll tackle that next week.

Dr Michele Veldsman is working with Lisa Nobis and Petya Kindalova at Oxford University to analyse hippocampal volumes and white matter hyperintensity distributions in UK Biobank data with some exciting new methods.

This week Michele read this Nature article on synchronised editing and collaborative writing.

Week 2 lockdown: graphical skills

In lockdown week two I’ve been using a guide designed by Dr Zoe Ayres to help scientists who have suddenly found themselves without a lab. I already have a schedule, daily time with my family and regular contact with my colleagues in place, so this week I’m concentrating on something that I and many others often sideline: making beautiful figures.

Skill development: graphics and figures

I believe an effective figure can tell most of the story in scientific articles. I have not yet found a single piece of software that caters to all my needs, so I usually use some combination of RStudio, MATLAB, Keynote and Inkscape to make publication-ready figures. The DPUK Data Portal has many of the tools I need to create figures already installed (R and MATLAB for example), meaning I don’t need to waste time sourcing university licenses for my home computer.

Fundamentals of Data Visualization by Claus Wilke is my figure-making bible. It has taught me how to make figures which not only accurately convey my data, but which are also visually pleasing. Colour is one of the strongest tools we have in data visualisation, so I take care to choose colour palettes that best reflect the underlying data. I use ColorBrewer, médialab’s ‘I want hue’ web tool, and Fabio Crameri’s scientific colour maps, all of which have palettes that are friendly to people with colour vision deficiency. I find making figures a good way to distract myself if I get overwhelmed by the difficult situation we are in or I am struggling to focus on writing.

I am learning to prioritise my mental health and wellbeing this week. If I am having a hard day, doing something I enjoy, like trying out some new software or improving my data visualisation skills, makes me feel like I am still making progress without too much additional mental load.

Image: 'Scientist without a lab' guide by Dr Zoe Ayres

Week 1 lockdown

It’s my first week in lockdown with my two-and-a-half-year-old daughter, Talia, home with us all day.

Monday was chaos – my husband and I had online meetings all day and it was impossible to do anything other than answer a few emails. But thankfully now (Thursday) things have started to settle down. My focus at the moment is finishing two papers that use data from UK Biobank. I'm looking at how the brain changes as we age and how an unhealthy lifestyle, or certain genes, accelerate these brain changes. This might help us understand how things like high blood pressure or smoking increase the risk of dementia. At the moment, I'm feeling very grateful to have data like UK Biobank to work with as it means my research is not too disrupted. Our lab usually tests older volunteers and dementia patients but that had to be put on hold very quickly as the coronavirus situation rapidly escalated. 

My husband works all afternoon while I look after Talia, and we both work again after she has gone to bed. We have to be flexible though, as kids are very unpredictable! We do Cosmic Kids Yoga and online ballet classes, combined with ‘races’ around the garden – that’s about all the daily exercise I can fit in!

I struggle with anxiety and depression, so this week I have used mindfulness to help me cope. Twitter is still my lifeline to the outside world and the academic community. I have found Slack and Zoom invaluable for keeping in touch with colleagues. They are also my way of ensuring that the students I work with and mentor are coping. I have learnt to delegate more this week, and colleagues have been very kind in offering to take things off my hands. Although there is a lot of uncertainty at the moment, I think this experience will make our academic community stronger and give us a sense of what is most important in life.

Dr Michele Veldsman is a postdoctoral researcher in cognitive neurology at Oxford University.