Reading Excel xlsx files with the read.xlsx and read.xlsx2 functions of the R xlsx package

 楓林秋2016 2019-09-01

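This post is a translation of the help page for the read.xlsx and read.xlsx2 functions in version 0.5.7 of the R package xlsx. Many passages are left in the original English because I could not understand them well enough to translate. I expect to maintain these posts over the long term; if you spot a translation error, please point it out in the comments.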

Description

Read the contents of a worksheet into an R data.frame.

Usage

read.xlsx(file, sheetIndex, sheetName=NULL,
    rowIndex=NULL,
    startRow=NULL, endRow=NULL, colIndex=NULL,
    as.data.frame=TRUE, header=TRUE, colClasses=NA,
    keepFormulas=FALSE, encoding="unknown", ...)
read.xlsx2(file, sheetIndex, sheetName=NULL,
    startRow=1, colIndex=NULL, endRow=NULL,
    as.data.frame=TRUE, header=TRUE, colClasses="character", ...)

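Arguments

file: the path to the Excel file.
sheetIndex: a number giving the index of the sheet in the workbook.
sheetName: the name of the sheet.
rowIndex: a numeric vector of the rows to extract. If NULL and neither startRow nor endRow is specified, all rows are extracted.
colIndex: a numeric vector of the columns to extract. If NULL, all columns are extracted.
as.data.frame: a logical value; whether to coerce the result into a data.frame. If FALSE, the result is a list with one element per column.
header: a logical value; whether to treat the first row as the column names.
colClasses: for read.xlsx, a character vector that represents the class of each column. Recycled as necessary, or if the character vector is named, unspecified values are taken to be NA. For read.xlsx2, see readColumns.
keepFormulas: a logical value; whether to keep Excel formulas as text.
encoding: the encoding to be assumed for input strings. See read.table.
startRow: a number giving the row at which reading starts. For read.xlsx, this is only used when rowIndex is NULL.
endRow: a number giving the last row to read. If NULL, all rows are read. For read.xlsx, this is only used when rowIndex is NULL.
...: other arguments to data.frame, for example stringsAsFactors.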
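As a rough illustration of the row and column arguments, the sketch below writes a small made-up table to a temporary file with write.xlsx and reads part of it back. The data frame `scores` and the file `tmp` are invented for this example, and the xlsx calls are skipped when the package (or Java) is not available.

```r
# Hypothetical data; nothing here comes from the files shipped with xlsx.
scores <- data.frame(name  = c("ann", "bob", "cy"),
                     score = c(1.5, 2.5, 3.5),
                     grade = c("a", "b", "c"))

if (requireNamespace("xlsx", quietly = TRUE)) {
  library(xlsx)
  tmp <- tempfile(fileext = ".xlsx")
  # row.names = FALSE so the header lands in row 1 and "name" in column 1
  write.xlsx(scores, tmp, row.names = FALSE)

  # Rows 1 to 3 (the header plus two data rows), columns 1 to 2
  part <- read.xlsx(tmp, sheetIndex = 1, rowIndex = 1:3, colIndex = 1:2)

  # The same rows as a contiguous range; startRow and endRow are
  # honoured here because rowIndex is left as NULL
  part2 <- read.xlsx(tmp, sheetIndex = 1, startRow = 1, endRow = 3,
                     colIndex = 1:2)

  print(part)
  unlink(tmp)
}
```

Note that rowIndex counts sheet rows, so the header row has to be included in the selection when header=TRUE.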

Details

The read.xlsx function provides a high-level interface for reading Excel data. It calls several low-level functions. Its goal is to provide the convenience of read.table by borrowing from its signature.

The function pulls the value of each non empty cell in the worksheet into a vector of type list by preserving the data type. If as.data.frame=TRUE, this vector of lists is then formatted into a rectangular shape. Special care is needed for worksheets with ragged data.

An attempt is made to guess the class type of the variable corresponding to each column in the worksheet from the type of the first non empty cell in that column. If you need to impose a specific class type on a variable, use the colClasses argument. It is recommended to specify the column classes and not rely on R to guess them, unless in very simple cases.
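To make the advice about colClasses concrete, here is a minimal sketch that reads the same made-up sheet twice, once with guessed classes and once with imposed ones. The data frame `ids` is hypothetical, stringsAsFactors is passed through to data.frame as described under Arguments, and the xlsx calls are skipped when the package is not available.

```r
# Hypothetical data: numeric-looking IDs that we would rather keep as text.
ids <- data.frame(id = c(101, 102, 103), score = c(1.5, 2.5, 3.5))

if (requireNamespace("xlsx", quietly = TRUE)) {
  library(xlsx)
  tmp <- tempfile(fileext = ".xlsx")
  write.xlsx(ids, tmp, row.names = FALSE)

  # Classes guessed from the first non-empty cell of each column
  guessed <- read.xlsx(tmp, sheetIndex = 1)
  print(sapply(guessed, class))

  # Classes imposed explicitly: id as text, score as a number
  typed <- read.xlsx(tmp, sheetIndex = 1,
                     colClasses = c("character", "numeric"),
                     stringsAsFactors = FALSE)
  print(sapply(typed, class))

  unlink(tmp)
}
```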

Excel stores dates and datetimes internally as numbers and keeps no time zone or daylight saving information. When a datetime value is read, it is converted to the POSIXct class with a GMT time zone. Occasional rounding errors may appear, and the R and Excel string representations may differ by one second. For read.xlsx2, bring in a datetime column as a numeric one and then convert it to class POSIXct or Date. Rounding the POSIXct column in R usually does the trick too.
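The conversion described above can be tried in base R without any Excel file. This is a small sketch using made-up serial numbers; the offset 25569 is the Excel serial number of 1970-01-01, as used in the conversion formulas in the Examples section.

```r
# Made-up Excel serial numbers, counted in days.
date_serial     <- 43709          # a whole number: a date
datetime_serial <- 43709 + 1/3    # one third of a day later: 08:00

d <- as.Date(date_serial - 25569, origin = "1970-01-01")
format(d)   # "2019-09-01"

p <- as.POSIXct((datetime_serial - 25569) * 86400,
                tz = "GMT", origin = "1970-01-01")

# Floating-point error can leave p a hair off the whole second,
# so round to the nearest second before printing or comparing.
p_rounded <- as.POSIXct(round(p, "secs"))
format(p_rounded, "%Y-%m-%d %H:%M:%S")   # "2019-09-01 08:00:00"
```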

The read.xlsx2 function does more of its work in Java, so it runs faster (an order of magnitude faster on sheets with 100,000 cells or more). The result of read.xlsx2 usually differs from that of read.xlsx, because read.xlsx2 is implemented internally with readColumns, which is tailored for tabular data.
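One visible difference follows from the defaults in the Usage section: read.xlsx guesses column classes (colClasses=NA), while read.xlsx2 defaults to colClasses="character". The sketch below compares the two on a made-up table; `tab` is hypothetical, and the xlsx calls are skipped when the package is not available.

```r
# Hypothetical two-column table.
tab <- data.frame(name = c("ann", "bob"), score = c(1.5, 2.5))

if (requireNamespace("xlsx", quietly = TRUE)) {
  library(xlsx)
  tmp <- tempfile(fileext = ".xlsx")
  write.xlsx(tab, tmp, row.names = FALSE)

  # read.xlsx guesses a class for each column
  slow <- read.xlsx(tmp, sheetIndex = 1)

  # read.xlsx2 with its default colClasses = "character"
  fast <- read.xlsx2(tmp, sheetIndex = 1, stringsAsFactors = FALSE)

  # read.xlsx2 with explicit classes, which brings the results closer
  fast_typed <- read.xlsx2(tmp, sheetIndex = 1,
                           colClasses = c("character", "numeric"),
                           stringsAsFactors = FALSE)

  print(sapply(slow, class))
  print(sapply(fast, class))
  print(sapply(fast_typed, class))
  unlink(tmp)
}
```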

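Value

A data.frame or a list, depending on the value of as.data.frame. If a column comes back entirely NA, the colClasses argument may be set incorrectly.

If the sheet is empty, NULL is returned. If the sheet does not exist, an error is raised.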
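Because a missing sheet raises an error and an empty sheet yields NULL, calling code may want to handle both cases. A minimal sketch, assuming a made-up workbook and the invented sheet name "no_such_sheet"; it is skipped when the xlsx package is not available.

```r
if (requireNamespace("xlsx", quietly = TRUE)) {
  library(xlsx)
  tmp <- tempfile(fileext = ".xlsx")
  write.xlsx(data.frame(x = 1:3), tmp, row.names = FALSE)

  # "no_such_sheet" is a made-up name that this workbook does not contain
  outcome <- tryCatch(
    read.xlsx(tmp, sheetName = "no_such_sheet"),
    error = function(e) e
  )

  if (inherits(outcome, "error")) {
    message("could not read the sheet: ", conditionMessage(outcome))
  } else if (is.null(outcome)) {
    message("the sheet is empty")
  }
  unlink(tmp)
}
```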

Author

Adrian Dragulescu

See Also

write.xlsx writes Excel files. See also readColumns, which can read just a few columns.

Examples

## Not run: 

file <- system.file("tests", "test_import.xlsx", package = "xlsx")
res <- read.xlsx(file, 1)  # read first sheet
head(res)
#          NA. Population Income Illiteracy Life.Exp Murder HS.Grad Frost   Area
# 1    Alabama       3615   3624        2.1    69.05   15.1    41.3    20  50708
# 2     Alaska        365   6315        1.5    69.31   11.3    66.7   152 566432
# 3    Arizona       2212   4530        1.8    70.55    7.8    58.1    15 113417
# 4   Arkansas       2110   3378        1.9    70.66   10.1    39.9    65  51945
# 5 California      21198   5114        1.1    71.71   10.3    62.6    20 156361
# 6   Colorado       2541   4884        0.7    72.06    6.8    63.9   166 103766


# To convert an Excel datetime column to POSIXct, do something like:
#   as.POSIXct((x-25569)*86400, tz="GMT", origin="1970-01-01")
# For Dates, use a conversion like:
#   as.Date(x-25569, origin="1970-01-01") 

res2 <- read.xlsx2(file, 1)  


## End(Not run)
