Process rows of an R data frame without a loop, in a memory-efficient way

The structure of my dataframe data1, which has over 1.5 million rows, is like this:

I need to add a column Exit.time computed from the columns WEEK and END and a cutoff value, 1287. Exit.time should be 0 or 1 according to the following logic:

If WEEK = 1287, then Exit.time = 0.

If WEEK is not equal to 1287 but WEEK = END, then Exit.time = 1; otherwise Exit.time = 0.

For this I tried the following for loop and it does what is required in the above dummy data set.

The problem is that when I use the above loop in my real data set, even after an hour I am not getting an output. I guess looping is not efficient given the size of the dataset. Is there an alternative way to do what I want? I prefer to maintain the order of rows in data1 since I need to do some merge operations later on.

Since Exit.time must be 1 when (WEEK == END) & (WEEK != 1287) and 0 otherwise, you can apply as.numeric to the result of that comparison, which converts TRUE to 1 and FALSE to 0.
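To make the coercion concrete, here is a minimal sketch on two plain vectors. The values are made up for illustration and are not taken from the asker's data; `is_exit` and `exit_time` are hypothetical names.

```r
# Made-up values, chosen so that every branch of the rule is exercised.
WEEK <- c(1285, 1286, 1287, 1287)
END  <- c(1286, 1286, 1287, 1290)

# `==`, `!=` and `&` are vectorized: they compare all elements at once,
# so no explicit loop is needed.
is_exit <- (WEEK == END) & (WEEK != 1287)

# Coercing the logical vector turns TRUE into 1 and FALSE into 0.
exit_time <- as.numeric(is_exit)
```

Only the second element has WEEK equal to END while WEEK differs from 1287, so it is the only one that becomes 1.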

There are multiple ways to code this. They differ mostly in syntax; fundamentally they all do the same thing.

Base R:
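A minimal base R sketch, assuming a small stand-in for data1 with only the WEEK and END columns (the sample values and the `cutoff` variable are illustrative, not from the question):

```r
# Hypothetical stand-in for data1; the real data has over 1.5 million rows.
data1 <- data.frame(
  WEEK = c(1285, 1286, 1287, 1287),
  END  = c(1286, 1286, 1287, 1290)
)
cutoff <- 1287

# One vectorized assignment over the whole column; row order is untouched.
data1$Exit.time <- as.numeric(data1$WEEK == data1$END & data1$WEEK != cutoff)
```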

This involves typing data1 a lot, so there is a short-cut:
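One such shortcut is `with()`, which evaluates an expression with the data frame's columns in scope. This is a sketch on made-up sample values, and it may not be the exact shortcut the answer originally showed:

```r
# Hypothetical stand-in for data1.
data1 <- data.frame(
  WEEK = c(1285, 1286, 1287, 1287),
  END  = c(1286, 1286, 1287, 1290)
)

# Inside with(), WEEK and END refer to columns of data1,
# so the data frame name is typed only once.
data1$Exit.time <- with(data1, as.numeric(WEEK == END & WEEK != 1287))
```

`transform(data1, Exit.time = ...)` is a similar base R option that returns a modified copy of the data frame.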

Tidyverse:
The tidyverse is a suite of packages that are great at manipulating data. We are using the package dplyr, which is part of the tidyverse, so you can either load the whole thing or just dplyr:
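A sketch of the dplyr version, assuming dplyr is installed and using made-up sample values in place of the real data1:

```r
library(dplyr)  # or library(tidyverse) to load the whole suite

# Hypothetical stand-in for data1.
data1 <- data.frame(
  WEEK = c(1285, 1286, 1287, 1287),
  END  = c(1286, 1286, 1287, 1290)
)

# mutate() adds the column and keeps the rows in their original order.
# Multiplying a logical vector by 1 coerces TRUE/FALSE to 1/0.
data1 <- data1 %>%
  mutate(Exit.time = (WEEK == END & WEEK != 1287) * 1)
```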

(I convert from TRUE/FALSE to 0/1 by multiplying by 1. It's less to type)

We can use case_when from dplyr.
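A sketch of how `case_when` could express the rule, again on made-up sample values. Conditions are checked in order and the first match wins, so the cutoff test goes first:

```r
library(dplyr)

# Hypothetical stand-in for data1.
data1 <- data.frame(
  WEEK = c(1285, 1286, 1287, 1287),
  END  = c(1286, 1286, 1287, 1290)
)

data1 <- data1 %>%
  mutate(Exit.time = case_when(
    WEEK == 1287 ~ 0,  # cutoff week: always 0
    WEEK == END  ~ 1,  # not the cutoff week and WEEK equals END
    TRUE         ~ 0   # everything else
  ))
```

This form mirrors the wording of the rule in the question, at the cost of being a little longer than the single logical expression.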

Using data.table:
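A sketch of a data.table version, assuming the data.table package is installed and using made-up sample values. `setDT()` converts the data frame in place and `:=` adds the column by reference, so the table is not copied, which matters for memory with 1.5 million rows:

```r
library(data.table)

# Hypothetical stand-in for data1.
data1 <- data.frame(
  WEEK = c(1285, 1286, 1287, 1287),
  END  = c(1286, 1286, 1287, 1290)
)

setDT(data1)  # convert to a data.table by reference, without copying

# `:=` adds Exit.time in place; row order is unchanged.
# as.integer() is used here, so the column is stored as integer 0/1.
data1[, Exit.time := as.integer(WEEK == END & WEEK != 1287)]
```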
