A Detailed Guide to Using Factors in R

Attributes of a Factor

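A factor in R is described by the following attributes, which correspond to the arguments of the factor() function:

1. x
The input vector that will be converted into a factor.
2. levels
An optional vector of the unique values that x may take.
3. labels
A character vector of labels for the levels, given in the same order as the levels.
4. exclude
A vector of values to exclude when the set of levels is formed.
5. ordered
A logical flag that determines whether the levels are treated as ordered.
6. nmax
An upper bound on the number of levels.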
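To see how these arguments fit together, here is a minimal sketch. The vector `sizes` and its values are made up for illustration and do not come from the examples in this guide.

```r
# Hypothetical input: clothing sizes, including one value outside the allowed set.
sizes <- c("small", "large", "medium", "small", "unknown", "large")

size_factor <- factor(
  sizes,
  levels  = c("small", "medium", "large"),  # allowed values, in the desired order
  labels  = c("S", "M", "L"),               # display labels, one per level
  ordered = TRUE                            # treat the levels as ordered
)

# "unknown" is not listed in levels, so it is stored as NA.
print(size_factor)
print(levels(size_factor))

# Ordered factors support comparisons between elements.
print(size_factor[1] < size_factor[2])

# exclude removes a value from the automatically detected levels.
excluded_factor <- factor(sizes, exclude = "unknown")
print(levels(excluded_factor))
```

Passing `levels` explicitly is usually the clearer choice when the order matters, because the default order is simply the sorted unique values.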

How to Create a Factor

1. First, create a vector.
2. Then convert the vector into a factor.

R provides the factor() function to convert a vector into a factor. Its basic syntax is:

``factor_data <- factor(vector)``

``````# Creating a vector as input.
data <- c("Shubham", "Nishka", "Arpita", "Nishka", "Shubham", "Sumit", "Nishka", "Shubham", "Sumit", "Arpita", "Sumit")

print(data)
print(is.factor(data))

# Applying the factor function.
factor_data<- factor(data)

print(factor_data)
print(is.factor(factor_data))``````

``````[1] "Shubham" "Nishka"  "Arpita"  "Nishka"  "Shubham" "Sumit"   "Nishka"
[8] "Shubham" "Sumit"   "Arpita"  "Sumit"
[1] FALSE
[1] Shubham Nishka Arpita Nishka Shubham Sumit Nishka Shubham Sumit
[10] Arpita Sumit
Levels: Arpita Nishka Shubham Sumit
[1] TRUE``````
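The printed factor shows the values and the levels, but not how they are stored. The short sketch below uses a shortened version of the vector above to inspect a factor with `levels()`, `nlevels()`, `as.integer()` and `table()`.

```r
data <- c("Shubham", "Nishka", "Arpita", "Nishka")
factor_data <- factor(data)

# The unique values, sorted alphabetically by default.
print(levels(factor_data))

# The number of levels.
print(nlevels(factor_data))

# The integer codes: each one is an index into levels(factor_data).
codes <- as.integer(factor_data)
print(codes)

# How often each level occurs.
print(table(factor_data))
```

A factor is therefore an integer vector plus a lookup table of level names, which is why it only accepts values from its set of levels.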

Accessing Components of a Factor

``````# Creating a vector as input.
data <- c("Shubham", "Nishka", "Arpita", "Nishka", "Shubham", "Sumit", "Nishka", "Shubham", "Sumit", "Arpita", "Sumit")

# Applying the factor function.
factor_data<- factor(data)

#Printing all elements of factor
print(factor_data)

#Accessing 4th element of factor
print(factor_data[4])

#Accessing 5th and 7th element
print(factor_data[c(5, 7)])

#Accessing all elements except the 4th one
print(factor_data[-4])

#Accessing elements using logical vector
print(factor_data[c(TRUE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE)])``````

``````[1] Shubham Nishka Arpita Nishka Shubham Sumit Nishka Shubham Sumit
[10] Arpita Sumit
Levels: Arpita Nishka Shubham Sumit

[1] Nishka
Levels: Arpita Nishka Shubham Sumit

[1] Shubham Nishka
Levels: Arpita Nishka Shubham Sumit

[1] Shubham Nishka Arpita Shubham Sumit Nishka Shubham Sumit Arpita
[10] Sumit
Levels: Arpita Nishka Shubham Sumit

[1] Shubham Shubham Sumit Nishka Sumit
Levels: Arpita Nishka Shubham Sumit``````
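As the output shows, every subset keeps all four levels, even those that no longer occur in it. If unused levels are not wanted, `droplevels()` removes them. A minimal sketch, using a shortened version of the vector above:

```r
data <- c("Shubham", "Nishka", "Arpita", "Nishka", "Shubham", "Sumit")
factor_data <- factor(data)

# Take the first two elements; the subset still carries all four levels.
subset_data <- factor_data[c(1, 2)]
print(levels(subset_data))

# droplevels() keeps only the levels that actually occur.
dropped <- droplevels(subset_data)
print(levels(dropped))
```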

Modifying a Factor

``````# Creating a vector as input.
data <- c("Shubham", "Nishka", "Arpita", "Nishka", "Shubham")

# Applying the factor function.
factor_data<- factor(data)

#Printing all elements of factor
print(factor_data)

#Change the 4th element of the factor to "Arpita"
factor_data[4] <-"Arpita"
print(factor_data)

#change 4th element of factor with "Gunjan"
factor_data[4] <- "Gunjan"    # cannot assign values outside levels
print(factor_data)

#Adding the value to the level
levels(factor_data) <- c(levels(factor_data), "Gunjan")#Adding new level
factor_data[4] <- "Gunjan"
print(factor_data)``````

``````[1] Shubham Nishka Arpita Nishka Shubham
Levels: Arpita Nishka Shubham
[1] Shubham Nishka Arpita Arpita Shubham
Levels: Arpita Nishka Shubham
Warning message:
In `[<-.factor`(`*tmp*`, 4, value = "Gunjan") :
invalid factor level, NA generated
[1] Shubham Nishka Arpita <NA> Shubham
Levels: Arpita Nishka Shubham
[1] Shubham Nishka Arpita Gunjan Shubham
Levels: Arpita Nishka Shubham Gunjan``````
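Another way to introduce a value that is not yet a level is to go through a character vector and rebuild the factor. This is only a sketch of one possible alternative, not the approach used above. Rebuilding recomputes the levels in sorted order, and any level that no longer occurs in the data would be dropped.

```r
factor_data <- factor(c("Shubham", "Nishka", "Arpita", "Nishka", "Shubham"))

# Convert to plain strings, where any value may be assigned.
values <- as.character(factor_data)
values[4] <- "Gunjan"

# Rebuild the factor; the levels are recomputed from the new values.
factor_data <- factor(values)
print(factor_data)
print(levels(factor_data))
```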

Factors in Data Frames

``````# Creating the vectors for data frame.
height <- c(132, 162, 152, 166, 139, 147, 122)
weight <- c(40, 49, 48, 40, 67, 52, 53)
gender <- c("male", "male", "female", "female", "male", "female", "male")

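# Creating the data frame. Since R 4.0.0, character columns are converted
# to factors only when stringsAsFactors = TRUE is passed explicitly.
input_data <- data.frame(height, weight, gender, stringsAsFactors = TRUE)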
print(input_data)

# Testing if the gender column is a factor.
print(is.factor(input_data$gender))

# Printing the gender column to see the levels.
print(input_data$gender)``````

``````height weight gender
1    132     40   male
2    162     49   male
3    152     48 female
4    166     40 female
5    139     67   male
6    147     52 female
7    122     53   male
[1] TRUE
[1] male   male   female female male   female male
Levels: female male``````
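Since R 4.0.0, `data.frame()` leaves character columns as character vectors unless told otherwise, so a column can also be converted explicitly after the data frame exists. The sketch below uses a shortened version of the data above to show this route.

```r
# Build the data frame without automatic conversion.
input_data <- data.frame(
  height = c(132, 162, 152),
  gender = c("male", "male", "female"),
  stringsAsFactors = FALSE
)
was_factor <- is.factor(input_data$gender)
print(was_factor)

# Convert the single column to a factor.
input_data$gender <- factor(input_data$gender)
print(is.factor(input_data$gender))
print(levels(input_data$gender))
```

Converting individual columns keeps free-text columns, such as names, as plain strings.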

Changing the Order of Levels

``````data <- c("Nishka", "Gunjan", "Shubham", "Arpita", "Arpita", "Sumit", "Gunjan", "Shubham")
# Creating the factors
factor_data<- factor(data)
print(factor_data)

# Apply the factor function with the required order of the level.
new_order_factor<- factor(factor_data, levels = c("Gunjan", "Nishka", "Arpita", "Shubham", "Sumit"))
print(new_order_factor)``````

``````[1] Nishka Gunjan Shubham Arpita Arpita Sumit Gunjan Shubham
Levels: Arpita Gunjan Nishka Shubham Sumit
[1] Nishka Gunjan Shubham Arpita Arpita Sumit Gunjan Shubham
Levels: Gunjan Nishka Arpita Shubham Sumit``````
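Reordering the levels leaves the values untouched and changes only the underlying integer codes. The sketch below checks this on a shortened version of the vector above, and also shows `relevel()`, which moves a single level to the front. The choice of "Shubham" as the reference level is arbitrary.

```r
data <- c("Nishka", "Gunjan", "Shubham", "Arpita")
factor_data <- factor(data)
new_order_factor <- factor(factor_data, levels = c("Gunjan", "Nishka", "Arpita", "Shubham"))

# The values are the same in both factors.
print(identical(as.character(factor_data), as.character(new_order_factor)))

# The integer codes differ, because they index into a different level order.
print(as.integer(factor_data))
print(as.integer(new_order_factor))

# relevel() moves one level to the first position and keeps the rest in order.
releveled <- relevel(factor_data, ref = "Shubham")
print(levels(releveled))
```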

Generating Factor Levels

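R provides the gl() function to generate factor levels. It takes three arguments: n, k, and labels. Here n and k are integers: n is the number of levels and k is the number of times each level is repeated.

The syntax of the gl() function is as follows:

``gl(n, k, labels)``
1. n is the number of levels.
2. k is the number of replications of each level.
3. labels is a vector of labels for the resulting factor levels.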

``````gen_factor<- gl(3, 5, labels=c("BCA", "MCA", "B.Tech"))
gen_factor``````

``````[1] BCA BCA BCA BCA BCA MCA MCA MCA MCA MCA
[11] B.Tech B.Tech B.Tech B.Tech B.Tech
Levels: BCA MCA B.Tech``````
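Besides the three arguments described above, `gl()` also accepts a `length` argument that sets the total length of the result, repeating the pattern as needed. The sketch below uses made-up labels ("control" and "treatment") to produce an alternating factor, and counts the levels of the factor generated above with `table()`.

```r
# Same call as above: 3 levels, each repeated 5 times in a row.
gen_factor <- gl(3, 5, labels = c("BCA", "MCA", "B.Tech"))
print(table(gen_factor))

# With k = 1 and a longer length, the levels alternate.
# The labels here are hypothetical and chosen only for illustration.
alt_factor <- gl(2, 1, length = 6, labels = c("control", "treatment"))
print(alt_factor)
```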