Missing levels in legend using autoplot() + scale_color_manual() functions

I'm trying to plot 3 different BED files (one with SNP data, one with deletions and a third with duplications), but the legend does not contain the values of all three layers unless I put all the data in the same file. The problem with combining the three files into a single one is that I then cannot set ylims for each level of the variable.

chr10 47000019 47000019 rs150696937 2 + 
chr11 1017064 1017064 NA 2 + 
chr11 1017280 1017280 rs199539548 2 + 
chr11 1017294 1017294 NA 2 + 
chr11 1017756 1017756 NA 2 + 
chr13 31898038 31898038 rs200460848 2 + 
chr13 40298639 40298639 NA 2 + 
chr13 48996928 48996928 rs530812916 2 + 
chr13 50204777 50204777 rs117251022 2 + 
chr14 20216005 20216005 rs566685404 2 + 
chr14 20404076 20404076 rs114526346 2 + 
chr21 10944668 10944668 rs138088406 2 +

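The lines above are an example of one of my input files (the one containing SNP information).

I'm using the score column to specify the kind of variant to plot, in the following way: "1" = Deletion; "2" = SNP; "3" = Duplication.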
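As a small illustration of that encoding, the sketch below reads two of the example lines and decodes the score column with a named lookup vector. The names bed_text, bed and variant_labels are made up for this example and are not part of the original code.

```r
# Two of the example BED lines, inlined so the snippet is self-contained.
bed_text <- "chr10 47000019 47000019 rs150696937 2 +
chr11 1017064 1017064 NA 2 +"

bed <- read.table(
  text = bed_text,
  header = FALSE,
  stringsAsFactors = FALSE,
  col.names = c("chr", "start", "end", "id", "score", "strand")
)

# Score code -> variant type, as described in the question.
variant_labels <- c("1" = "Deletion", "2" = "SNP", "3" = "Duplication")

# Index by the score (as character) to get a readable variant type per row.
bed$variant <- unname(variant_labels[as.character(bed$score)])
print(bed[, c("chr", "start", "score", "variant")])
```

In this file every row decodes to "SNP", which is why only one of the three variant types is present in the data being plotted.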

These are the libraries I use (the function that converts my BED files into GRanges objects comes from http://davetang.org/muse/2015/02/04/bed-granges/):

## Load libraries and required databases
library(ggbio)
library(GenomicRanges)

data(hg19IdeogramCyto, package = "biovizBase")
data(hg19Ideogram, package = "biovizBase")

hg19 <- keepSeqlevels(hg19IdeogramCyto, paste0("chr", c(1:22, "X", "Y")))

biovizBase::isIdeogram(hg19)

To import my BED file I use the Bed2GRanges function (bed_to_granges) available on that website; its code is reproduced after the import step:

## Import bed files as GRanges file 
SNP <- bed_to_granges("SNPs.bed") 
seqlengths(SNP) <- seqlengths(hg19Ideogram)[names(seqlengths(SNP))] 
SNP_dn <- keepSeqlevels(SNP, paste0("chr", c(1:22, "X", "Y")))

# Required Bed2GRanges function 

# BED to GRanges 
# 
# This function loads a BED-like file and stores it as a GRanges object. 
# The tab-delimited file must be ordered as 'chr', 'start', 'end', 'id', 'score', 'strand'. 
# The minimal BED file must have the 'chr', 'start', 'end' columns. 
# Any columns after the strand column are ignored. 
# 
# @param file Location of your file 
# @keywords BED GRanges 
# @export 
# @examples 
# bed_to_granges('my_bed_file.bed') 

bed_to_granges <- function(file){ 
     df <- read.table(file, 
         header=F, 
         stringsAsFactors=F) 

     if(length(df) > 6){ 
       df <- df[,-c(7:length(df))] 
     } 

     if(length(df)<3){ 
       stop("File has less than 3 columns") 
     } 

     header <- c('chr','start','end','id','score','strand') 
     names(df) <- header[1:length(names(df))] 

     if('strand' %in% colnames(df)){ 
       df$strand <- gsub(pattern="[^+-]+", replacement = '*', x = df$strand) 
     } 

     library("GenomicRanges") 

     if(length(df)==3){ 
       gr <- with(df, GRanges(chr, IRanges(start, end))) 
     } else if (length(df)==4){ 
       gr <- with(df, GRanges(chr, IRanges(start, end), id=id)) 
     } else if (length(df)==5){ 
       gr <- with(df, GRanges(chr, IRanges(start, end), id=id, score=as.character(score))) 
     } else if (length(df)==6){ 
       gr <- with(df, GRanges(chr, IRanges(start, end), id=id, score=as.character(score), strand=strand)) 
     } 
     return(gr) 
}

Then I plot my data. I specify the option drop = FALSE in scale_color_manual():

#Plotting SNP_dn according to score column 
test <- autoplot(SNP_dn, aes(color = score)) + 
     scale_color_manual("Variant type", 
          values = c("black", "red", "blue"), 
          breaks = c("2","1","3"), 
          drop = FALSE, 
          labels = c("SNP", "Deletion", "Duplication")) + 
     theme(legend.position = "right") 

test

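Even so, I always miss the levels "Deletion" and "Duplication" in the legend.

I've been struggling with this for several days but I can't find a solution.

I would like to have a legend containing the three levels I specify with the scale_color_manual() function (i.e. "SNP", "Deletion", "Duplication") even if they are not present in the BED file.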

R version 3.3.1 (2016-06-21) 
Platform: x86_64-w64-mingw32/x64 (64-bit) 
Running under: Windows 7 x64 (build 7601) Service Pack 1 

locale: 
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C       
[5] LC_TIME=English_United States.1252  

attached base packages: 
[1] stats4 parallel stats  graphics grDevices utils  datasets methods base  

other attached packages: 
[1] biovizBase_1.20.0 ggbio_1.20.2   GenomicRanges_1.24.3 GenomeInfoDb_1.8.7 IRanges_2.6.1  
[6] S4Vectors_0.10.3  ggplot2_2.1.0  BiocGenerics_0.18.0 

loaded via a namespace (and not attached): 
[1] Rcpp_0.12.7     lattice_0.20-34    Rsamtools_1.24.0    
[4] Biostrings_2.40.2    digest_0.6.10     mime_0.5      
[7] R6_2.1.3      plyr_1.8.4     chron_2.3-47     
[10] acepack_1.3-3.3    RSQLite_1.0.0     BiocInstaller_1.22.3   
[13] httr_1.2.1     zlibbioc_1.18.0    GenomicFeatures_1.24.5  
[16] data.table_1.9.6    rpart_4.1-10     Matrix_1.2-7.1    
[19] labeling_0.3     splines_3.3.1     BiocParallel_1.6.6   
[22] AnnotationHub_2.4.2   stringr_1.1.0     foreign_0.8-67    
[25] RCurl_1.95-4.8    biomaRt_2.28.0    munsell_0.4.3    
[28] shiny_0.14     httpuv_1.3.3     rtracklayer_1.32.2   
[31] htmltools_0.3.5    nnet_7.3-12     SummarizedExperiment_1.2.3 
[34] gridExtra_2.2.1    interactiveDisplayBase_1.10.3 Hmisc_3.17-4     
[37] XML_3.98-1.4     reshape_0.8.5     GenomicAlignments_1.8.4  
[40] bitops_1.0-6     RBGL_1.48.1     grid_3.3.1     
[43] xtable_1.8-2     GGally_1.2.0     gtable_0.2.0     
[46] DBI_0.5-1      magrittr_1.5     scales_0.4.0     
[49] graph_1.50.0     stringi_1.1.1     XVector_0.12.1    
[52] reshape2_1.4.1    latticeExtra_0.6-28   Formula_1.2-1    
[55] RColorBrewer_1.1-2   ensembldb_1.4.7    tools_3.3.1     
[58] dichromat_2.0-0    OrganismDbi_1.14.1   BSgenome_1.40.1    
[61] Biobase_2.32.0    survival_2.39-5    AnnotationDbi_1.34.4   
[64] colorspace_1.2-6    cluster_2.0.4     VariantAnnotation_1.18.7

Asked 2016-09-16 by Yatrosin

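I think you need to use limits instead of breaks in scale_color_manual. – aosmith

Thank you. I had not specified the other values that the original dataset could take, and I didn't know I had to declare them with the factor() function. Now it works. – Yatrosin

It wasn't about the limits, but about the original dataset, where I hadn't specified the values this field could contain. – Yatrosin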

One option is to make sure your factor contains all the levels you want to plot. This allows drop = FALSE to be effective.

You can do this via factor and the levels argument. For example, if I wanted to add a level 5 to mtcars$cyl:

mtcars$cyl = factor(mtcars$cyl, levels = c("4", "5", "6", "8"))
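Applied to the question's data, the same idea might look like the sketch below. It uses a toy data frame and plain ggplot2 rather than ggbio's autoplot(), so snp_df and its pos column are stand-ins invented for this example. The show.legend = TRUE argument is an assumption about newer ggplot2 releases, where it may be needed for the keys of unused levels to be drawn; it should be harmless otherwise.

```r
library(ggplot2)

# Toy stand-in for the SNP track: only score "2" occurs in the data.
snp_df <- data.frame(
  pos   = c(1017064, 1017280, 1017294),
  score = c("2", "2", "2"),
  stringsAsFactors = FALSE
)

# Declare every value the field can take, not only the observed one.
snp_df$score <- factor(snp_df$score, levels = c("1", "2", "3"))

p <- ggplot(snp_df, aes(x = pos, y = 0, color = score)) +
  geom_point(show.legend = TRUE) +
  scale_color_manual(
    "Variant type",
    values = c("1" = "red", "2" = "black", "3" = "blue"),
    breaks = c("2", "1", "3"),
    labels = c("SNP", "Deletion", "Duplication"),
    drop   = FALSE  # keep unused factor levels in the legend
  )

# With the GRanges object from the question, the analogous (untested here)
# step would be:
# SNP_dn$score <- factor(SNP_dn$score, levels = c("1", "2", "3"))
```

Printing p should give a legend with all three entries even though only SNP points are drawn.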

Another option is to replace breaks with limits in scale_color_manual. This approach does not rely on the actual factor levels in the data (so drop = FALSE doesn't do anything).

scale_color_manual("Variant type", 
       values = c("black", "red", "blue"), 
       limits = c("2","1","3"), 
       labels = c("SNP", "Deletion", "Duplication"))
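To try the limits variant outside ggbio, here is a minimal sketch on a toy data frame. The names snp_df and variant_limits are invented for this example, and show.legend = TRUE is again an assumption aimed at newer ggplot2 versions. The score column is left as a plain character vector to show that no factor levels are needed.

```r
library(ggplot2)

# Toy stand-in for the SNP track; score stays a character column.
snp_df <- data.frame(
  pos   = c(1017064, 1017280, 1017294),
  score = c("2", "2", "2"),
  stringsAsFactors = FALSE
)

# The scale, not the data, defines which values exist and in which order.
variant_limits <- c("2", "1", "3")

p <- ggplot(snp_df, aes(x = pos, y = 0, color = score)) +
  geom_point(show.legend = TRUE) +
  scale_color_manual(
    "Variant type",
    values = c("2" = "black", "1" = "red", "3" = "blue"),
    limits = variant_limits,
    labels = c("SNP", "Deletion", "Duplication")  # same order as limits
  )
```

Naming the values vector ties each colour to its score code, so the colours stay correct if the order of limits is changed later.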

Answered 2016-09-16 17:43:01 by aosmith
