purrr map and walk: a complete tutorial

2023-10-12 02:52


Function reference • purrr: https://purrr.tidyverse.org/reference/index.html

Map over multiple input simultaneously (in "parallel"): pmap • purrr

11 Other purrr functions | Functional Programming (stanford.edu)


11.1 Map functions that output tibbles

Instead of creating an atomic vector or list, the map variants map_dfr() and map_dfc() create a tibble.

With these map functions, the assembly line worker creates a tibble for each input element, and the output conveyor belt ends up with a collection of tibbles.

The worker then combines all the small tibbles into a single, larger tibble. There are multiple ways to combine smaller tibbles into a larger tibble. map_dfr() (r for rows) stacks the smaller tibbles on top of each other.

map_dfc() (c for columns) stacks them side-by-side.

There are _dfr and _dfc variants of pmap() and map2() as well. In the following sections, we’ll cover map_dfr() and map_dfc() in more detail.
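As a minimal sketch of the difference between the two variants, the example below uses small in-memory tibbles rather than files. The helper names make_row() and make_col() are hypothetical and exist only for illustration. Both variants rely on dplyr being installed.

```r
library(purrr)
library(tibble)

# Hypothetical worker: returns a one-row tibble for each input value.
make_row <- function(x) tibble(value = x, squared = x^2)

# map_dfr() stacks the one-row tibbles on top of each other.
by_rows <- map_dfr(1:3, make_row)
dim(by_rows)    # 3 rows, 2 columns

# Hypothetical worker: returns a one-column tibble with a distinct name.
make_col <- function(power) {
  out <- tibble(v = (1:3)^power)
  names(out) <- paste0("pow_", power)
  out
}

# map_dfc() places the one-column tibbles side by side.
by_cols <- map_dfc(1:2, make_col)
names(by_cols)  # "pow_1" "pow_2"
```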

11.1.1 _dfr

map_dfr() is useful when reading in data from multiple files. The following code reads in several very simple csv files, each of which contains the name of a different dinosaur genus.

library(tidyverse)  # provides read_csv(), str_glue(), the purrr functions, and ggplot2

read_csv("data/purrr-extras/file_001.csv")
#> # A tibble: 1 × 2
#>      id genus        
#>   <dbl> <chr>        
#> 1     1 Hoplitosaurus

read_csv("data/purrr-extras/file_002.csv")
#> # A tibble: 1 × 2
#>      id genus        
#>   <dbl> <chr>        
#> 1     2 Herrerasaurus

read_csv("data/purrr-extras/file_003.csv")
#> # A tibble: 1 × 2
#>      id genus      
#>   <dbl> <chr>      
#> 1     3 Coelophysis

read_csv() produces a tibble, and so we can use map_dfr() to map over all three file names and bind the resulting individual tibbles into a single tibble.

files <- str_glue("data/purrr-extras/file_00{1:3}.csv")
files
#> data/purrr-extras/file_001.csv
#> data/purrr-extras/file_002.csv
#> data/purrr-extras/file_003.csv

files %>% map_dfr(read_csv)
#> # A tibble: 3 × 2
#>      id genus        
#>   <dbl> <chr>        
#> 1     1 Hoplitosaurus
#> 2     2 Herrerasaurus
#> 3     3 Coelophysis

The result is a tibble with three rows and two columns, because map_dfr() aligns the columns of the individual tibbles by name.
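The files in data/purrr-extras/ may not be available to every reader, so here is a self-contained sketch of the same pattern. It writes three small CSV files to a temporary directory (an assumption made only so the example runs anywhere) and then reads them back with map_dfr(). The .id argument, which is not used in the text above, records which file each row came from; it needs a named input, so the paths are named after their file names.

```r
library(purrr)
library(readr)

# Assumed setup: create three small CSV files in a temporary directory.
dir <- tempfile("purrr-extras-")
dir.create(dir)
genera <- c("Hoplitosaurus", "Herrerasaurus", "Coelophysis")
paths <- file.path(dir, sprintf("file_%03d.csv", 1:3))
walk2(paths, seq_along(genera), function(path, i) {
  write_csv(data.frame(id = i, genus = genera[i]), path)
})

# Name each path after its file so .id can label the rows.
named_paths <- set_names(paths, basename(paths))

# Read every file and stack the results by row.
combined <- map_dfr(named_paths, read_csv, show_col_types = FALSE, .id = "source")
combined
```

The result should have one row per file, with a source column in front of id and genus.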

The individual tibbles can have different numbers of rows or columns. map_dfr() just creates a column for each unique column name. If some of the individual tibbles lack a column that others have, map_dfr() fills in with NA values.

read_csv("data/purrr-extras/file_004.csv")
#> # A tibble: 2 × 3
#>      id genus         start_period 
#>   <dbl> <chr>         <chr>        
#> 1     4 Dilophosaurus Sinemurian   
#> 2     5 Segisaurus    Pliensbachian

c(files, "data/purrr-extras/file_004.csv") %>% map_dfr(read_csv)
#> # A tibble: 5 × 3
#>      id genus         start_period 
#>   <dbl> <chr>         <chr>        
#> 1     1 Hoplitosaurus <NA>         
#> 2     2 Herrerasaurus <NA>         
#> 3     3 Coelophysis   <NA>         
#> 4     4 Dilophosaurus Sinemurian   
#> 5     5 Segisaurus    Pliensbachian
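The NA filling can be checked without any files. The sketch below builds two tibbles by hand, one of which lacks start_period, and stacks them. It also shows what is, as far as I know, the replacement spelling in purrr 1.0.0 and later, where map_dfr() is marked as superseded in favor of map() followed by list_rbind(). Treat the second form as an assumption about your installed purrr version.

```r
library(purrr)
library(tibble)

# Two hand-built tibbles; the first has no start_period column.
pieces <- list(
  tibble(id = 1, genus = "Hoplitosaurus"),
  tibble(id = 4, genus = "Dilophosaurus", start_period = "Sinemurian")
)

# map_dfr() aligns columns by name and fills the gap with NA.
stacked <- map_dfr(pieces, identity)
stacked$start_period  # NA "Sinemurian"

# Equivalent form, assuming purrr >= 1.0.0 is installed.
stacked_new <- pieces %>% map(identity) %>% list_rbind()
```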

11.1.2 _dfc

map_dfc() is typically less useful than map_dfr() because it relies on row position to stack the tibbles side-by-side. Row position is prone to error, and it will often be difficult to check if the data in each row is aligned correctly. However, if you have data with variables in different places and are positive the rows are aligned, map_dfc() may be appropriate.

Unfortunately, even if the individual tibbles contain a unique identifier for each row, map_dfc() doesn’t use the identifiers to verify that the rows are aligned correctly, nor does it combine identically named columns.

read_csv("data/purrr-extras/file_005.csv")
#> # A tibble: 1 × 3
#>      id diet      start_period
#>   <dbl> <chr>     <chr>       
#> 1     1 herbivore Barremian

c("data/purrr-extras/file_001.csv", "data/purrr-extras/file_005.csv") %>% map_dfc(read_csv)
#> # A tibble: 1 × 5
#>   id...1 genus         id...3 diet      start_period
#>    <dbl> <chr>          <dbl> <chr>     <chr>       
#> 1      1 Hoplitosaurus      1 herbivore Barremian

Instead, you end up with a duplicated column (id...1 and id...3).
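To make the alignment risk concrete, here is a small sketch with two hand-built tibbles whose rows are deliberately in a different order. The data values are illustrative and not taken from the files above. map_dfc() pairs rows by position and so attaches the wrong diet to the first genus, while a join on id pairs them correctly.

```r
library(purrr)
library(tibble)
library(dplyr)

# Illustrative data: the same ids, but in a different row order.
left  <- tibble(id = c(1, 2), genus = c("Hoplitosaurus", "Herrerasaurus"))
right <- tibble(id = c(2, 1), diet = c("carnivore", "herbivore"))

# Binding by position: ids are duplicated and never compared.
side_by_side <- map_dfc(list(left, right), identity)
names(side_by_side)   # "id...1" "genus" "id...3" "diet"
side_by_side$diet[1]  # "carnivore", although row 1 is Hoplitosaurus

# Joining on the identifier pairs the rows correctly.
joined <- left_join(left, right, by = "id")
joined$diet[1]        # "herbivore"
```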

If you have a unique identifier for each row, it is much better to join on that identifier.

left_join(
  read_csv("data/purrr-extras/file_001.csv"),
  read_csv("data/purrr-extras/file_005.csv"),
  by = "id"
)
#> # A tibble: 1 × 4
#>      id genus         diet      start_period
#>   <dbl> <chr>         <chr>     <chr>       
#> 1     1 Hoplitosaurus herbivore Barremian

Also, because map_dfc() combines tibbles by row position, the tibbles can have different numbers of columns, but they should have the same number of rows.
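A quick way to see the row-count requirement is to try binding tibbles of two and three rows. In the sketch below the call fails, and tryCatch() turns the error into a short message so the script keeps running. The exact wording of the underlying error depends on the installed package versions, so it is not reproduced here.

```r
library(purrr)
library(tibble)

two_rows   <- tibble(a = 1:2)
three_rows <- tibble(b = 1:3)

# Binding by position cannot line up 2 rows with 3 rows, so this errors.
result <- tryCatch(
  map_dfc(list(two_rows, three_rows), identity),
  error = function(e) "row counts differ"
)
result
```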

11.2 Walk

The walk functions work similarly to the map functions, but you use them when you’re interested in applying a function that performs an action instead of producing data (e.g., print()).
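One practical difference is the return value. As a minimal sketch: walk() calls the function for its side effect and then returns its input invisibly, whereas map() collects whatever the function returns. With a function like cat(), which returns NULL, map() leaves you with a list of NULLs.

```r
library(purrr)

x <- c(5, 1, 9)

# walk() performs the action and hands back its input unchanged.
returned <- walk(x, function(v) cat("sd =", v, "\n"))
identical(returned, x)  # TRUE

# map() performs the same action but keeps the (useless) NULL results.
mapped <- map(x, function(v) cat("sd =", v, "\n"))
length(mapped)          # 3
```

Because walk() returns its input, it can sit in the middle of a pipeline without interrupting it.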

The walk functions are useful for performing actions like writing files and printing plots. For example, say we used purrr to generate a list of plots.

set.seed(745)

plot_rnorm <- function(sd) {
  tibble(x = rnorm(n = 5000, mean = 0, sd = sd)) %>%
    ggplot(aes(x)) +
    geom_histogram(bins = 40) +
    geom_vline(xintercept = 0, color = "blue")
}

plots <-
  c(5, 1, 9) %>%
  map(plot_rnorm)

We can now use walk() to print them out.

plots %>% walk(print)
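Writing files works the same way. The sketch below is one possible approach, not part of the original tutorial: it rebuilds a smaller version of the plot list and uses walk2() to pair each plot with an output path and save it with ggsave(). The temporary output directory, the file names, the PDF format, and the reduced sample size are all assumptions chosen so the example runs quickly anywhere.

```r
library(purrr)
library(tibble)
library(ggplot2)

set.seed(745)

# Smaller variant of plot_rnorm() from the text (500 draws instead of 5000).
plot_rnorm <- function(sd) {
  tibble(x = rnorm(n = 500, mean = 0, sd = sd)) %>%
    ggplot(aes(x)) +
    geom_histogram(bins = 40)
}

sds <- c(5, 1, 9)
plots <- map(sds, plot_rnorm)

# Assumed output location: one PDF per plot in a temporary directory.
out_dir <- tempfile("plots-")
dir.create(out_dir)
paths <- file.path(out_dir, paste0("rnorm_sd_", sds, ".pdf"))

# walk2() iterates over paths and plots in parallel, saving each plot.
walk2(paths, plots, function(path, plot) {
  ggsave(path, plot = plot, width = 5, height = 4)
})

file.exists(paths)
```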




http://www.chinasem.cn/article/192734
