2017-12-06 16 views
-3

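How to replace bullets (•) with dashes in R

I have a column in a data frame that contains strings. Some of the values contain the bullet character '•' because the data was read from an Excel file. I want to replace every bullet with a dash '-' using gsub. However, gsub has no effect on this column, even when I copy and paste the bullet itself as the pattern.

Yet when I test gsub manually with: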

gsub("•", "-", "test • case •") 

I get the output I want.

Why does it work in a manual test but not on the data in my data frame? Here is a sample of my own data to try:

structure(list(Function = c("Tax Services", "Deloitte Risk and Financial Advisory", "Consulting", "Tax Services"), Specialty = c("Business Tax", "Cyber Risk", "Product & Solutions", "Business Tax"), ReqNum = c("E18NATTSRCRK012-FP", "E18ORLASRCGB001-USDC", "E18NYCCASCSW009-PS-C", "E18NATTMGRRK010-FP" ), Title = c("Senior – National Federal Tax – Investment Management – Financial Products", "Identity and Access Management - IBM- ISIM/ISAM- Senior Solution Engineer", "Client Delivery Consultant – Revenue Intellect", "Manager – National Federal Tax – Investment Management – Financial Products" ), Responsibilities = c("Work you’ll do: As a Senior Consultant on our iPACS Financial Products team, you will: - Assist with the development and use of custom technology tools related to the complex tax analysis and reporting needs of financial products - Assist with the design and implementation of proprietary technologies - Provide training and ongoing support to tax compliance teams - Provide tax services by utilizing proprietary technology tools - Supervise engagement teams and work with Managers and Senior Managers to provide day-to-day guidance to junior staff - Coordinate assignments between our US and US-India offices ", "Work you’ll do • Multitask and switch gears to meet changing priorities and tasks to accomplish goals/objectives. • Work in a distributed team environment where team members are spread across numerous locations and often communicate virtually. • Support a flexible work schedule (to include nights and weekends on occasion). • Comfortable performing task lead responsibilities for small to medium software projects. • Designing, implementing, and deploying IAM solutions to support regulatory requirements such as Sarbanes-Oxley. These IAM solutions help in ensuring segregation of duties while helping prevent fraud and unauthorized access. 
", "Work You’ll Do: • Client facilitation – working with clients to communicate data needs, ensure their understanding (from both a business and technology standpoint) of the Revenue Intellect™ data request, and act as a primary point of contact during the implementation process • Delivery management – planning, tracking, and reporting implementation progress using structured tools and methodologies • Data validation/profiling - reviewing client data to ensure validity and completeness, and working with clients to revise as necessary; identifying, investigating, and resolving issues and irregularities in data through both implementation and testing phases of a project • Client training – conducting on-site Revenue Intellect™ training of client end users following application go-live, as well as subsequent version release trainings • Client relationship development – establishing relationships with key client end users to help drive further adoption of the application throughout and at all levels of their organization • Insight development – using revenue cycle and general healthcare subject matter knowledge to help clients draw insights from their data and turn data into actionable change • Product development input – liaising with our product development team to communicate potential application enhancements for future releases that have been identified through conversations with both internal and external clients ", "Work you’ll do: As a Manager on our iPACS Financial Products team, you will: - Manage the development and use of custom technology tools to assist with the complex tax analysis and reporting needs of financial products - Be responsible for day-to-day management of multiple client engagements and supervise teams in both our U.S. 
and India-based offices - Assist with the design and implementation of proprietary technologies - Interact directly with clients and other regional office Deloitte teams - Manage the economics of client projects - Mentor and develop engagement staff including in our U.S. India office, providing leadership, counseling, career guidance, and guidance on issues related to work/life fit and retention " )), .Names = c("Function", "Specialty", "ReqNum", "Title", "Responsibilities" ), row.names = c(692L, 30L, 693L, 691L), class = "data.frame")

+5

Can you give us the result of `dput(df.strCol$column_name)` for a few rows of the original data that contain '•'? – thelatemail

+0

Yes: **NA_character_** –

+0

Did you change 'column_name' to the name of the column that has the bullets? `dput` should give you a text version of the data that you can paste here, so that others can figure out what you are doing. – thelatemail

Answers

1

Here is a reproducible example of a data frame, produced with the help of dput(). It is a sample of my own data; in my case, the bullets are in the column names.

df <- structure(list(`•T1` = "a", `•T2` = "b", `•T3` = "c", `•T4` = "a", `•T5` = "b"), 
.Names = c("•T1", "•T2", "•T3", "•T4", "•T5"), class = "data.frame", row.names = 1L) 
  1. First strategy: capture a single bullet character in an object, then use that object as the pattern in gsub.

    bullet_object <- substr(colnames(df)[1], start = 1, stop =1) 
    gsub(bullet_object, "-", colnames(df)) 
    
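  2. Second strategy: use the negation of the [:print:] class in the regular expression. This matches every character that is not alphanumeric, punctuation, or a space. The bullet falls into this category, but be careful: other uncommon characters are affected too, and the interpretation of [:print:] depends on the locale (for example, normalize the text with `utf8_normalize` first).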

    gsub("[^[:print:]]", "-", colnames(df))
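
A third variant, not part of the original answer, is to write the bullet as the Unicode escape "\u2022" instead of pasting the character, and to check the code points first to confirm which character the data really contains. The sketch below is a minimal illustration under assumptions: the data frame and its values are hypothetical stand-ins for the question's Responsibilities column, and it assumes the bullet in the real data is U+2022 rather than a look-alike character.

```r
# Hypothetical data: a character column containing bullets (U+2022),
# standing in for the Responsibilities column from the question.
df <- data.frame(
  Responsibilities = c("Work you'll do \u2022 Multitask \u2022 Support",
                       "no bullets here"),
  stringsAsFactors = FALSE
)

# Inspect the code points of one value to see which character is really there.
# utf8ToInt() takes a single string; 8226 is U+2022 in decimal.
code_points <- utf8ToInt(df$Responsibilities[1])
bullet_present <- 8226L %in% code_points

# Replace via the Unicode escape; fixed = TRUE treats the pattern
# as a literal string rather than a regular expression.
df$Responsibilities <- gsub("\u2022", "-", df$Responsibilities, fixed = TRUE)

# The same call works on column names, as in the answer's example.
names_with_bullets <- c("\u2022T1", "\u2022T2")
clean_names <- gsub("\u2022", "-", names_with_bullets, fixed = TRUE)
```

If `bullet_present` is FALSE on the real data, the character is something other than U+2022 (or the value is NA), and the code points printed by `utf8ToInt` show which escape to use instead.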