Photo by Luke Chesser on Unsplash

As a biologist, I often deal with data. No matter how fluent I am in programming, deciding the steps of my analysis pipeline is always a matter of trial and error. It sometimes requires consulting the literature and discussing with fellow researchers. My analysis pipeline always starts with Exploratory Data Analysis (EDA). EDA helps me investigate what is happening in my data and detect critical points that can be important for further analysis.

When I say EDA is the first step, I mean the first step of my own analysis. That does not necessarily make EDA the initial step of data analysis in general. EDA is an approach to analysing data that summarizes the data's main characteristics, often with graphical illustrations. The initial step of data analysis has its own term: Initial Data Analysis (IDA). IDA focuses on checking assumptions for model fitting, handling missing values, and transforming variables. …


Photo by Imat Bagja Gumilar on Unsplash

I started learning data science as an environmentalist. Statistics was and will always be my first go-to tool to organize data for solving real-life problems.

I studied a branch of environmental science that hardly anyone considers as their first choice when entering university: forest and agricultural science. It is an interdisciplinary subject, because I could focus not only on the forest but also on plant physiology, genetics, ecology and landscape science, environmental science, epidemiology, and many more. …


Photo by Dose Media on Unsplash

Correlation is one of statistics' all-time classics, yet it is still a measure that almost everyone uses in their analysis. In the classic interpretation, correlation measures the relationship or correspondence between two variables. It is usually visualized through a correlation plot and measured using the correlation coefficient (r), which ranges from -1 to 1. The important takeaway from r is that it shows the degree of relationship between two variables, in terms of how a change in one variable is associated with a change in the other.
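To make the coefficient concrete, here is a minimal sketch of how r could be computed by hand, assuming the Pearson correlation coefficient is the one meant here. The function name `pearson_r` and the sample numbers are made up for illustration and do not come from the article.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)

    # Deviations of every observation from its own mean.
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]

    # Numerator: how the two variables deviate together.
    # Its sign decides the sign of r.
    co_deviation = sum(a * b for a, b in zip(dev_x, dev_y))

    # Denominator: the size of each variable's own deviations.
    # Dividing by it rescales the result into the range -1 to 1.
    size_x = math.sqrt(sum(a * a for a in dev_x))
    size_y = math.sqrt(sum(b * b for b in dev_y))
    if size_x == 0 or size_y == 0:
        raise ValueError("r is undefined when a variable is constant")

    return co_deviation / (size_x * size_y)


# Hypothetical data: y rises (or falls) in exact proportion to x.
x = [1, 2, 3, 4]
print(pearson_r(x, [2, 4, 6, 8]))  # perfectly increasing, r is about 1
print(pearson_r(x, [8, 6, 4, 2]))  # perfectly decreasing, r is about -1
```

The numerator carries the sign, and the denominator is what keeps the result within -1 and 1.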

While interpreting a correlation plot is widely known and quite straightforward, an intuitive understanding of the equation for the correlation coefficient (r) is less common. That is what this article is about: why does the result lie between -1 and 1, and where does the sign come from? …


Why not start considering the median and paying more attention to the standard deviation?

This cover picture may not be related to the story post, but I want to use this space for the pictures I took.

Disclaimer: This post describes my thinking about the statistical measures that I believe deserve more attention. I am not an expert in statistics, and in this article I’m just sharing my opinion.

I first started working with statistics when I wrote my bachelor’s thesis, but looking back at the manuscript, I can see that I lacked a basic foundation in statistics. …

About

Firza Riany

Hi! I like to confuse people: I use data to study forests
