Create column based on specific values in other columns
NickName:Troels Ask DateTime:2022-04-29T01:06:50

I am trying to create a new column based on specific values in several (four) other columns.

Reprex:

col1 <- c("a", "", "a")
col2 <- c("", "b", "b")
df <- data.frame(col1, col2)

I have tried with case_when, but cannot get it to apply to all values.

df$col3 <- with(df, case_when(col1 %in% c("a") & col2 %in% c("b") ~ "c" ))

What I want to achieve:
| col1|   col2   |  col3   |
|:--: | :-------:|:-------:|
|   a |          |    a    |
|     |     b    |    b    |
|   a |     b    |    c    |

So if col1 is a and col2 is empty, col3 should be a. If col1 is empty and col2 is b, col3 should be b. Finally, if col1 is a and col2 is b, col3 should be c.

Copyright notice: Content author: Troels. Reproduced under the CC BY-SA 4.0 license with a link to the original source and this disclaimer.
Link to original article: https://stackoverflow.com/questions/72047691/create-column-based-on-specific-values-in-other-columns

Answers
TarJae 2022-04-28T17:36:55

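Here is how we could do it with case_when: first replace the blank cells with NA, then test each combination of col1 and col2.

library(dplyr)

df %>%
  # Turn empty strings into NA in every column whose name starts with "col".
  # The lambda form avoids passing extra arguments through across(),
  # which is deprecated as of dplyr 1.1.0.
  mutate(across(starts_with("col"), ~ na_if(.x, "")),
         col3 = case_when(
           col1 == "a" & is.na(col2) ~ "a",
           is.na(col1) & col2 == "b" ~ "b",
           col1 == "a" & col2 == "b" ~ "c",
           TRUE ~ NA_character_))  # any other combination stays NA

  col1 col2 col3
1    a <NA>    a
2 <NA>    b    b
3    a    b    c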
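For comparison, here is a minimal base R sketch of the same mapping that needs no packages. It is not the answer's method: it swaps case_when for nested ifelse() calls and tests with %in% instead of converting blanks to NA first. The helper names has_a and has_b are made up for illustration. One assumed difference in behaviour: a row where col1 is "a" and col2 holds some value other than "b" gets "a" here, whereas the case_when version would leave it NA.

```r
# Example data from the question
df <- data.frame(
  col1 = c("a", "", "a"),
  col2 = c("", "b", "b"),
  stringsAsFactors = FALSE
)

# %in% returns FALSE (never NA) for blank or missing cells,
# so the blanks do not need to be converted to NA before testing.
has_a <- df$col1 %in% "a"   # hypothetical helper name
has_b <- df$col2 %in% "b"   # hypothetical helper name

# Check the most specific case (both present) first, then each single case.
df$col3 <- ifelse(has_a & has_b, "c",
           ifelse(has_a, "a",
           ifelse(has_b, "b", NA_character_)))

print(df)
```

With the three rows from the question, col3 comes out as "a", "b", "c". The order of the tests matters: the combined condition has to be checked before the single-column ones, otherwise the third row would be labelled "a".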

