2017-09-11 11 views
-1

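Comparison of two data frames

I have an Excel table of 15200 rows, where each row is a tree on which structures were surveyed. Every structure has its own column (48 structures), and each one was counted on every tree. For example, tree 12607 has 3 structures CV11, 1 structure IN12 and none (0) of all the other structures. The table is therefore huge and mostly zeros, with a count wherever a structure occurs on a tree. The last column is the value given to the tree according to the structures found on it (each structure, by being present, adds a certain number of points to the tree).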

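Question: is there a structure, or a combination of structures, that gives a tree a high value? Of course, I can see from the value of each structure which ones are worth more (for example, structure CV11 is worth 15 and structure IN12 is worth 4). What I want to know is this: if I take all the trees with a final value above 100 (we create a new data frame "data100") and compare them with the trees whose final value is below 100 (we create another data frame "data0"), can I find out whether there is a significant difference in the number and occurrence of the structures found on these trees? It could be that a high-value structure is found only on trees with a value below 100, for example because that structure prevents other structures from occurring on the same tree.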

There, I hope I have given enough detail. If you have any ideas or suggestions for solving this problem, that would be great!

Below is my script.

> data100 
     CV11 CV12 CV13 CV14 CV15 CV21 CV22 CV23 CV24 CV25 CV26 CV31 CV32 CV33 CV41 CV42 CV43 CV44 CV51 CV52 IN11 IN12 IN13 
1  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
2  0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
3  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 
4  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 
5  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 
6  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 
7  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
8  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
9  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
10  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
11  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
12  0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 
13  0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
14  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
15  0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
     IN14 IN21 IN22 IN23 IN31 IN32 IN33 IN34 BA11 BA12 BA21 DE11 DE12 DE13 DE14 DE15 GR11 GR12 GR13 GR21 GR22 GR31 GR32 
1  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
2  0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 
3  0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 
4  0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 
5  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
6  0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 
7  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
8  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
9  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
10  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
11  0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 2 0 0 0 0 0 
12  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 3 0 0 
13  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 3 0 0 
14  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 
15  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
     EP11 EP12 EP13 EP14 EP21 EP31 EP32 EP33 EP34 EP35 NE11 NE12 NE21 OT11 OT12 OT21 OT22 ecoval 
1  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  0 
2  1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  56 
3  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  10 
4  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  10 
5  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  4 
6  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  24 
7  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  0 
8  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  0 
9  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  0 
10  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  0 
11  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  18 
12  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  63 
13  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  77 
14  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  54 
15  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  20 
[ reached getOption("max.print") -- omitted 60749 rows ] 
> sortdata100<-data100[order(data100[,64],decreasing=T),] 

> rsortdata100<-sortdata100[sortdata100$ecoval>100,] 
> rsortdata100<-na.omit(rsortdata100) # 181 rows 
> rsortdata100 
     CV11 CV12 CV13 CV14 CV15 CV21 CV22 CV23 CV24 CV25 CV26 CV31 CV32 CV33 CV41 CV42 CV43 CV44 CV51 CV52 IN11 IN12 IN13 
1291  0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
1083  0 4 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
3919  0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 
14685 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 
4021  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 
5452  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
14686 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 
4022  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 
1013  0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
2895  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
4719  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 
682  0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 
3444  0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
1299  0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 
2713  0 0 0 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 
     IN14 IN21 IN22 IN23 IN31 IN32 IN33 IN34 BA11 BA12 BA21 DE11 DE12 DE13 DE14 DE15 GR11 GR12 GR13 GR21 GR22 GR31 GR32 
1291  0 0 0 0 0 0 0 0 30 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
1083  3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
3919  0 0 1 0 2 0 0 0 2 0 0 0 3 0 0 0 0 0 0 11 0 0 0 
14685 0 0 0 0 0 0 0 0 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
4021  0 0 0 0 0 0 0 0 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
5452  0 0 1 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 
14686 0 0 0 0 0 0 0 0 11 0 0 0 0 0 0 0 0 0 0 0 0 0 2 
4022  0 0 0 0 0 0 0 0 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
1013  0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
2895  0 0 0 1 0 0 0 0 4 0 0 3 0 4 3 0 0 0 0 0 0 0 0 
4719  0 0 0 0 0 0 0 0 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
682  0 0 0 0 0 0 0 0 0 0 0 0 0 2 1 0 0 0 0 0 0 0 0 
3444  0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
1299  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 
2713  0 0 0 2 0 3 0 0 2 0 0 0 1 5 1 0 0 0 0 0 0 0 0 
     EP11 EP12 EP13 EP14 EP21 EP31 EP32 EP33 EP34 EP35 NE11 NE12 NE21 OT11 OT12 OT21 OT22 ecoval 
1291  0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1192 
1083  0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 424 
3919  1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 380 
14685 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 370 
4021  0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 358 
5452  0 0 0 0 0 0 1 0 0 11 0 0 0 0 1 0 0 356 
14686 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 354 
4022  0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 346 
1013  0 8 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 326 
2895  0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 325 
4719  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 324 
682  0 0 0 6 0 0 0 0 0 0 0 0 0 0 0 0 0 311 
3444  0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 306 
1299  0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 302 
2713  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 302 
[ reached getOption("max.print") -- omitted 166 rows ] 
> data0<-sortdata100[sortdata100$ecoval<100,] 
> data0<-na.omit(data0) 
> data0 
     CV11 CV12 CV13 CV14 CV15 CV21 CV22 CV23 CV24 CV25 CV26 CV31 CV32 CV33 CV41 CV42 CV43 CV44 CV51 CV52 IN11 IN12 IN13 
4728  0 0 0 1 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 
5339  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 
11766 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
796  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
3561  0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 
10581 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 
10618 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 0 0 0 0 
14376 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 
14389 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 
790  0 0 0 1 0 0 0 0 1 0 0 2 0 0 0 0 0 0 0 0 1 0 0 
3974  0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 
4739  0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 1 0 0 0 0 0 0 
156  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
2740  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
2950  0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 
     IN14 IN21 IN22 IN23 IN31 IN32 IN33 IN34 BA11 BA12 BA21 DE11 DE12 DE13 DE14 DE15 GR11 GR12 GR13 GR21 GR22 GR31 GR32 
4728  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 
5339  1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 
11766 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 
796  1 1 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
3561  0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
10581 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 
10618 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 
14376 1 0 0 0 0 0 0 0 1 0 0 0 0 2 0 0 0 0 0 0 0 0 0 
14389 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 1 0 0 0 0 0 0 0 
790  0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 
3974  0 0 0 0 0 0 0 0 1 0 0 0 4 0 0 0 1 0 0 0 0 0 0 
4739  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
156  0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 
2740  0 0 0 0 0 0 0 0 0 0 0 0 0 6 2 0 0 0 0 0 0 0 0 
2950  0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
     EP11 EP12 EP13 EP14 EP21 EP31 EP32 EP33 EP34 EP35 NE11 NE12 NE21 OT11 OT12 OT21 OT22 ecoval 
4728  0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0  99 
5339  0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0  99 
11766 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1  99 
796  1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  98 
3561  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  98 
10581 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0  98 
10618 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0  98 
14376 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  98 
14389 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  98 
790  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  97 
3974  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  97 
4739  0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 1 0  97 
156  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  96 
2740  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0  96 
2950  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  96 
[ reached getOption("max.print") -- omitted 14984 rows ] 
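The script above splits on `ecoval>100` and `ecoval<100`, so a tree whose `ecoval` is exactly 100 would land in neither data frame. A minimal sketch of a split that covers every row, on a small made-up data frame (the name `toy` and its values are invented for illustration; only the column names come from the question, and `ecoval` is assumed to contain no `NA`):

```r
# Toy stand-in for the real table: two structure columns plus ecoval.
toy <- data.frame(
  CV14   = c(1, 0, 4, 0, 0, 1),
  BA11   = c(30, 0, 2, 0, 3, 0),
  ecoval = c(1192, 10, 302, 0, 98, 100)
)

# Logical flag: TRUE for high-value trees, FALSE for all the others.
high <- toy$ecoval > 100

high_trees <- toy[high, ]    # ecoval > 100
low_trees  <- toy[!high, ]   # ecoval <= 100, including exactly 100

# Every tree ends up in exactly one of the two groups.
all_covered <- nrow(high_trees) + nrow(low_trees) == nrow(toy)
print(all_covered)
```

Using one logical vector and its negation keeps the two groups complementary by construction.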
+2

Sorry, this is not clear to me. Please see [how to ask a good question] and [how to provide a reproducible example] (http://stackoverflow.com/questions/5963269). This will make it easier for others to help you. – zx8754

Answers

0

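Maybe this? It gives you the mean of each column, separately for the trees with ecoval > 100 and for those with ecoval <= 100.

library(dplyr) 
# `data` is the full table (called data100 in the question)
# group the rows by whether ecoval exceeds 100, then average every other column
data %>% group_by(ecoval > 100) %>% summarize_all(mean) 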
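For readers without dplyr, roughly the same per-group means can be obtained in base R with `aggregate`. This is only a sketch on invented data: the data frame `toy` and its values are assumptions, and `ecoval` itself is left out of the averaged columns.

```r
# Toy stand-in for the real table (values invented for illustration).
toy <- data.frame(
  CV14   = c(1, 0, 4, 0, 0, 1),
  BA11   = c(30, 0, 2, 0, 3, 0),
  ecoval = c(1192, 10, 302, 0, 98, 100)
)

# All columns except the score are structure counts.
structure_cols <- setdiff(names(toy), "ecoval")

# One row per group: high = FALSE (ecoval <= 100) and high = TRUE (ecoval > 100).
group_means <- aggregate(toy[structure_cols],
                         by  = list(high = toy$ecoval > 100),
                         FUN = mean)
print(group_means)
```

Each cell is the average count of that structure per tree within the group, so a large gap between the two rows flags a structure that is more common among high-value trees.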

+0

Thanks for your answer. I am not sure how to interpret the result in R. What do FALSE and TRUE mean? Are the values on the line labelled TRUE the means? –

+0

# A tibble: 2 × 65
  `ecoval > 100` CV11 CV12 CV13 CV14 CV15 CV21 CV22 CV23 CV24
1 FALSE 0.00299880 0.003398641 0.0003332001 0..0005331201 0.005997601 0.00206584 0.003531921 0.00146608
2 TRUE 0.03314917 0.154696133 0.0441988950 0.535911602 0.0552486188 0.060773481 0.03867403 0.077348066 0.03867403
–

+0

You are grouping the rows by the condition `ecoval > 100`, so the row containing `TRUE` summarizes the data for `ecoval > 100`, and the row containing `FALSE` summarizes the data for `ecoval <= 100`. –
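The group means show the size of a difference but not whether it is significant, which is what the question asks. One possible way to check this, not proposed in the thread itself, is Fisher's exact test on presence/absence of each structure in the two groups, with a correction for testing many structures. The sketch below runs on invented data (`toy` and its values are assumptions); it looks at single structures only, not combinations or counts.

```r
# Toy stand-in for the real table (values invented for illustration).
toy <- data.frame(
  CV14   = c(1, 0, 4, 0, 0, 1, 2, 0),
  BA11   = c(30, 0, 2, 0, 3, 0, 11, 0),
  ecoval = c(1192, 10, 302, 0, 98, 100, 354, 4)
)

high <- toy$ecoval > 100                         # TRUE = high-value tree
structure_cols <- setdiff(names(toy), "ecoval")

# For each structure: 2x2 table of present/absent versus high/low, then Fisher's test.
p_values <- sapply(structure_cols, function(col) {
  present <- toy[[col]] > 0
  # Fixed factor levels keep the table 2x2 even if one category is empty.
  tab <- table(factor(present, levels = c(FALSE, TRUE)),
               factor(high,    levels = c(FALSE, TRUE)))
  fisher.test(tab)$p.value
})

# Benjamini-Hochberg adjustment, since one test is run per structure.
p_adjusted <- p.adjust(p_values, method = "BH")
print(p_adjusted)
```

A small adjusted p-value for a structure suggests that its presence is associated with the `TRUE` or `FALSE` group; the sign of the difference can then be read from the group means.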