How to separate values of multiple classes in R?

r

#1

Hello R Users,

I’m stuck on this problem. I have a column containing ticket numbers.
Some of the values are numeric and some are alphanumeric. I want to separate these two kinds of values by creating a new column.

In this new column, I want to assign 1 to alphanumeric values and 0 to numeric values. How can I do that?

Ticket BA 21171 17599 STONEV. 3101282 113803 373450 330877 STD2 9549 349909 347742 237736 PP 9549 113783 A/5. 2151 347082 350406 248706 382652 244373 345763 2649 239865 248698 330923 113788 349909 347077 2631 19950 330959 349216 PC 17601 PC 17569 335677 C.A. 24579 PC 17604 113789 2677 A./5. 2152 345764 2651 7546 11668 349253 SC/ 2123 330958 S.C./A.4. 23567 370371 14311 2662 349237 3101295 A/4. 39886 PC 17572 2926 113509 19947 C.A. 31026 2697 C.A. 34651 CA 2144 2669 113572 36973 347088 PC 17605 2661 C.A. 29395 S.P. 3464 3101281 315151 C.A. 33111 CA 2144 S.O.C. 14879 2680 1601 348123 349208 374746 248738 364516 345767 345779 330932 113059 SO/C 14885 3101278 W./C. 6608 SOTON/OQ 392086 19950 343275 343276 347466 W.E.P. 5734 C.A. 2315 364500 374910 PC 17754 PC 17759 231919 244367 349245 349215 35281 7540 3101276 349207 343120 312991 349249 371110 110465 2665 324669 4136 2627 STON/O 2. 3101294 370369 11668 PC 17558 347082 S.O.C. 14879 A4. 54510 237736 27267 35281 2651 370372 C 17369 2668 347061 349241 SOTON/O.Q. 3101307 A/5. 3337 228414 C.A. 29178 SC/PARIS 2133 11752 113803 7534 PC 17593 2678 347081 STON/O2. 3101279 365222 231945 C.A. 33112 350043 W./C. 6608 230080 244310 S.O.P. 1166 113776 A.5. 11206 A/5. 851 Fa 265302 PC 17597 35851 SOTON/OQ 392090 315037 CA. 2343 371362 C.A. 33595 347068 315093 3101295 363291 113505 347088 PC 17318 1601 111240 382652 347742 STON/O 2. 3101280 17764 350404 4133 PC 17595 250653 LINE CA. 2343 SC/PARIS 2131 347077 230136 315153 113767 370365 111428 364849 349247 234604 28424 350046 230080 PC 17610 PC 17569 368703 4579 370370 248747 345770 CA. 2343 3101264 2628 A/5 3540 347054 3101278 2699 367231 112277 SOTON/O.Q. 3101311 F.C.C. 13528 A/5 21174 250646 367229 35273 STON/O2. 3101283 243847 11813 W/C 14208 SOTON/OQ 392089 220367 21440 349234 19943 PP 4348 SW/PP 751 A/5 21173 236171 4133 36973 347067 237442 347077 C.A. 29566 W./C. 6609 26707 C.A. 31921 28665 SCO/W 1585 2665 367230 W./C. 14263 STON/O 2. 
3101275 2694 19928 347071 250649 11751 244252 362316 347054 113514 A/5. 3336 370129 2650 PC 17585 110152 PC 17755 230433 384461 347077 110413 112059 382649 C.A. 17248 3101295 347083 PC 17582 PC 17760 113798 LINE 250644 PC 17596 370375 13502 347073 239853 382652 C.A. 2673 336439 347464 345778 A/5. 10482 113056 349239 345774 349206 237798 370373 19877 11967 SC/Paris 2163 349236 349233 PC 17612 2693 113781 19988 PC 17558 9234 367226 LINE 226593 A/5 2466 113781 17421 PC 17758 P/PP 3381 PC 17485 11767 PC 17608 250651 349243 F.C.C. 13529 347470 244367 29011 36928 16966 A/5 21172 349219 234818 248738 CA. 2343 PC 17760 345364 28551 363291 111361 367226 113043 PC 17582 345764 PC 17611 349225 113776 16966 7598 113784 230080 19950 248740 244361 229236 248733 31418 386525 C.A. 37671 315088 7267 113510 2695 349237 2647 345783 113505 237671 330931 330980 347088 SC/PARIS 2167 2691 SOTON/O.Q. 3101310 370365 C 7076 110813 2626 14313 PC 17477 11765 3101267 323951 PC 17760 349909 PC 17604 C 7077 113503 2648 347069 PC 17757 2653 STON/O 2. 3101293 113789 349227 S.O.C. 14879 CA 2144 27849 367655 SC 1748 113760 350034 3101277 35273 PP 9549 350052 350407 28403 244278 240929 STON/O 2. 3101289 341826 4137 STON/O2. 3101279 315096 28664 347064 29106 312992 4133 349222 394140 19928 239853 STON/O 2. 3101269 343095 28220 250652 28228 345773 349254 A/5. 13032 315082 347080 370129 A/4. 34244 2003 250655 364851 SOTON/O.Q. 392078 110564 376564 SC/AH 3085 STON/O 2. 3101274 13507 113760 W./C. 6608 29106 19950 C.A. 18723 F.C.C. 13529 345769 347076 230434 65306 33638 250644 113794 2666 113786 C.A. 34651 65303 113051 17453 A/5 2817 349240 13509 17464 F.C.C. 13531 371060 19952 364506 111320 234360 A/S 2816 SOTON/O.Q. 3101306 239853 113792 36209 2666 323592 315089 C.A. 34651 SC/AH Basle 541 7553 110465 31027 3460 350060 3101298 CA 2144 239854 A/5 3594 4134 11967 4133 19943 11771 A.5. 18509 C.A. 37671 65304 SOTON/OQ 3101317 113787 PC 17609 A/4 45380 2627 36947 C.A. 
6212 113781 350035 315086 364846 330909 4135 110152 PC 17758 26360 111427 C 4001 1601 382651 SOTON/OQ 3101316 PC 17473 PC 17603 349209 36967 C.A. 34260 371110 226875 349242 12749 349252 2624 111361 2700 367232 W./C. 14258 PC 17483 3101296 29104 26360 2641 2690 2668 315084 F.C.C. 13529 113050 PC 17761 364498 13568 WE/P 5735 347082 347082 2908 PC 17761 693 2908 SC/PARIS 2146 363291 C.A. 33112 17421 244358 330979 2620 347085 113807 11755 PC 17757 110413 345572 372622 349251 218629 SOTON/OQ 392082 SOTON/O.Q. 392087 A/4 48871 349205 349909 2686 350417 S.W./PP 752 11769 PC 17474 14312 A/4. 20589 358585 243880 13507 2689 STON/O 2. 3101286 237789 17421 28403 13049 3411 110413 237565 13567 14973 A./5. 3235 STON/O 2. 3101273 36947 A/5 3902 364848 SC/AH 29037 345773 248727 LINE 2664 PC 17485 243847 349214 113796 364511 111426 349910 349246 113804 SC/Paris 2123 PC 17582 347082 SOTON/O.Q. 3101305 367230 370377 364512 220845 347080 A/5. 3336 230136 31028 2659 11753 2653 350029 54636 36963 219533 13502 349224 334912 27042 347743 13214 112052 347088 237668 STON/O 2. 3101292 C.A. 31921 3101295 376564 350050 PC 17477 347088 1601 2666 PC 17572 349231 13213 S.O./P.P. 751 CA. 2314 349221 231919 8475 330919 365226 S.O.C. 14879 349223 364849 29751 35273 PC 17611 2623 5727 349210 STON/O 2. 3101285 S.O.C. 14879 234686 312993 A/5 3536 19996 29750 F.C. 12750 C.A. 24580 244270 239856 349912 342826 4138 CA 2144 PC 17755 330935 PC 17572 6563 CA 2144 29750 SC/Paris 2123 3101295 349228 350036 24160 17474 349256 1601 2672 113800 248731 363592 35852 17421 348121 PC 17757 PC 17475 2691 36864 350025 250655 223596 PC 17476 113781 2661 PC 17482 113028 19996 7545 250647 348124 PC 17757 34218 36568 347062 248727 350048 12233 250643 113806 315094 31027 36866 236853 STON/O2. 3101271 24160 2699 239855 28425 233639 54636 W./C. 6608 PC 17755 349201 349218 16988 19877 PC 17608 376566 STON/O 2. 3101288 WE/P 5735 C.A. 
2673 250648 113773 335097 29103 392096 345780 349204 220845 250649 350042 29108 363294 110152 358585 SOTON/O2 3101272 2663 113760 347074 13502 112379 364850 371110 8471 345781 350047 S.O./P.P. 3 2674 29105 347078 383121 364516 36865 24160 2687 17474 113501 W./C. 6607 SOTON/O.Q. 3101312 374887 3101265 382652 C.A. 2315 PC 17593 12460 239865 CA. 2343 PC 17600 349203 28213 17465 349244 2685 345773 250647 C.A. 31921 113760 2625 347089 347063 112050 347087 248723 113806 3474 A/4 48871 28206 347082 364499 112058 STON/O2. 3101290 S.C./PARIS 2079 C 7075 347088 12749 315098 19972 392096 3101295 368323 1601 S.C./PARIS 2079 367228 113572 2659 29106 2671 347468 2223 PC 17756 315097 392092 1601 11774 SOTON/O2 3101287 S.O./P.P. 3 113798 2683 315090 C.A. 5547 CA. 2343 349213 248727 17453 347082 347060 2678 PC 17592 244252 392091 36928 113055 2666 2629 350026 28134 17466 CA. 2343 233866 236852 SC/PARIS 2149 PC 17590 345777 347742 349248 11751 695 345765 P/PP 3381 2667 7534 349212 349217 11767 230433 349257 7552 C.A./SOTON 34068 SOTON/OQ 392076 382652 211536 112053 W./C. 6607 111369 370376 330911 363272 240276 315154 3101298 7538 330972 248738 2657 A/4 48871 349220 694 21228 24065 W.E.P. 5734 SC/PARIS 2167 233734 2692 STON/O2. 3101270 2696 PC 17603 C 17368 PC 17598 PC 17597 PC 17608 A/5. 3337 113509 2698 113054 2662 SC/AH 3085 C.A. 31029 C.A. 2315 W./C. 6607 13236 2682 342712 315087 345768 1601 349256 113778 SOTON/O.Q. 3101263 237249 11753 STON/O 2. 3101291 PC 17594 370374 11813 C.A. 37671 13695 SC/PARIS 2168 29105 19950 SC/A.3 2861 382652 349230 348122 386525 PC 17608 349232 237216 347090 334914 PC 17608 F.C.C. 13534 330963 113796 2543 19950 382653 349211 3101297 PC 17562 113503 113503 359306 11770 248744 368702 2678 PC 17483 19924 349238 240261 2660 330844 A/4 31416 364856 29103 347072 345498 F.C. 12750 376563 13905 350033 19877 STON/O 2. 3101268 347471 A./5. 
3338 11778 228414 365235 347070 2625 C 4001 330920 383162 3410 248734 237734 330968 PC 17531 329944 PC 17483 2680 2681 PP 9549 13050 SC/AH 29037 C.A. 33595 367227 13236 392095 368783 371362 350045 367226 211535 342441 STON/OQ. 369943 113780 4133 2621 349226 350409 2656 248659 SOTON/OQ 392083 CA 2144 CA 2144 113781 PC 17608 244358 17475 345763 17463 SC/A4 23568 113791 250651 11767 349255 3701 350405 347077 S.O./P.P. 752 PC 17483 347469 110489 SOTON/O.Q. 3101315 335432 2650 220844 343271 237393 315153 PC 17591 W./C. 6608 17770 7548 S.O./P.P. 251 2670 347072 2673 347077 29750 C.A. 33112 11778 230136 PC 17756 233478 PC 17756 113773 7935 PC 17558 239059 S.O./P.P. 2 A/4 48873 CA. 2343 28221 226875 111163 A/5. 851 235509 28220 347465 16966 347066 C.A. 31030 65305 36568 347080 PC 17757 26360 C.A. 34050 F.C. 12998 9232 28034 PC 17613 349250 C 4001 SOTON/O.Q. 3101308 S.O.C. 14879 24065 347091 113038 330924 36928 113503 32302 SC/PARIS 2148 342684 W./C. 14266 350053 PC 17606 2661 350054 370368 C.A. 6212 242963 220845 113795 3101266 330971 PC 17599 350416 110813 2679 250650 PC 17761 112377 237789 16966 3470 W./C. 6607 17464 F.C.C. 13534 28220 26707 2660 C.A. 34651 SOTON/O2 3101284 13508 7266 345775 C.A. 42795 AQ/4 3130 363611 28404 345501 345572 350410 29103 350405 C.A. 34644 349235 112051 C.A. 49867 A. 2. 39186 315095 13050 368573 13508 370371 2676 236853 SC 14888 2926 CA 31352 W./C. 14260 315085 SOTON/O.Q. 3101315 364859 2650 370129 A/5 21175 SOTON/O.Q. 3101314 21228 2655 A/5 1478 PC 17607 382650 2652 33638 345771 349202 SC/Paris 2123 2662 113801 347467 347079 237735 S.O./P.P. 2 315092 383123 112901 113781 392091 12749 350026 315091 2658 LP 1588 368364 PC 17760 AQ/3. 30631 PC 17569 28004 350408 C.A. 31029 347075 2654 244368 113790 24160 SOTON/O.Q. 3101309 230136 PC 17585 2003 236854 C.A. 33112 PC 17580 2684 2653 349229 110469 244360 2675 C.A. 31029 2622 C.A. 15185 350403 CA. 2343 PC 17755 A/5. 851 348125 237670 2688 248726 F.C.C. 13528 PC 17759 F.C.C. 13540 S.O.C. 
14879 220845 C.A. 2315 113044 11769 1222 368402 349910 CA. 2343 S.C./PARIS 2079 CA 31352 315083 11765 CA. 2343 2689 3101295 112378 SC/PARIS 2147 28133 16966 112058 248746 33638 PC 17608 315152 29107 680 347077 366713 330910 364498 376566 SC/PARIS 2159 220845 349911 244346 364858 349909 12749 PC 17592 C.A. 2673 C.A. 30769 315153 13695 371109 13567 347065 21332 36928 28664 112378 113059 17765 SC/PARIS 2166 28666 113503 334915 SOTON/O.Q. 3101315 365237 19928 347086 A.5. 3236 PC 17758 SOTON/O.Q. 3101262 359309 2668 2668

P.S. I tried pasting this data as a column, but couldn’t. Sorry for the inconvenience.

Please help!


#2

Hi @indutaneja11,

You can use the code below:

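# Returns 1 for values that cannot be parsed as a number (alphanumeric), 0 otherwise.
is_alphanumeric <- function(x) {
  # Non-numeric strings become NA; the coercion warning is expected, so suppress it.
  y <- suppressWarnings(as.numeric(as.character(x)))
  ifelse(is.na(y), 1, 0)
}
# x is the data frame holding the Ticket column. The function is vectorized,
# so it can be applied to the whole column without mapply().
x$is_alphanumeric <- as.factor(is_alphanumeric(x$Ticket))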

I'm not sure it's the most optimized way, but it will work :smile:
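
The same flag can also be computed with a single regular-expression test instead of a numeric conversion. The sketch below is only an illustration: the data frame name `tickets` and the helper vector `is_numeric_ticket` are hypothetical, and the sample values are a handful of tickets copied from the question. It treats a ticket as numeric only when it consists entirely of digits.

```r
# Hypothetical data frame with a few ticket values copied from the question.
tickets <- data.frame(
  Ticket = c("PC 17601", "113803", "C.A. 24579", "373450", "LINE", "W./C. 6608"),
  stringsAsFactors = FALSE
)

# TRUE when the value consists of digits only (surrounding whitespace ignored).
is_numeric_ticket <- grepl("^[0-9]+$", trimws(tickets$Ticket))

# 0 for purely numeric tickets, 1 for everything else.
tickets$is_alphanumeric <- as.integer(!is_numeric_ticket)

print(tickets)
```

One difference from the as.numeric() approach: as.numeric() also accepts strings such as "1e5" as numbers, while the digit-only pattern does not.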

Regards,
Aayush


#3

Hi @indutaneja11,

You can try this:

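library(stringr)
# Extract the first run of digits from each ticket and convert it to a number
combi.num <- as.numeric(str_extract(combi$Ticket, "[0-9]+"))
# Extract the first run of letters; NA when the ticket has no letters.
# "[A-Za-z]" is used because the range "A-z" also matches [, \, ], ^, _ and `.
combi.cha <- str_extract(combi$Ticket, "[A-Za-z]+")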
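
Note that str_extract() returns only the first match, so for a ticket like "A/5. 2151" the pattern "[0-9]+" yields 5, not 2151. If the goal is to split each ticket into its prefix and its trailing number, something along the following lines may work. This is a rough base R sketch rather than the stringr approach above; the names `ticket`, `ticket_prefix`, and `ticket_number` are made up for illustration, and it assumes the number of interest is the run of digits at the end of the string.

```r
# A few ticket values copied from the question.
ticket <- c("PC 17601", "113803", "A/5. 2151", "LINE", "SC/PARIS 2133")

# Trailing run of digits, or NA when the ticket does not end in a digit.
has_number <- grepl("[0-9]+$", ticket)
ticket_number <- rep(NA_real_, length(ticket))
ticket_number[has_number] <- as.numeric(
  sub("^.*?([0-9]+)$", "\\1", ticket[has_number], perl = TRUE)
)

# Whatever precedes the trailing digits, or NA when nothing does.
ticket_prefix <- trimws(sub("[0-9]+$", "", ticket))
ticket_prefix[ticket_prefix == ""] <- NA

print(data.frame(ticket, ticket_prefix, ticket_number))
```

A ticket is then purely numeric exactly when `ticket_prefix` is NA, which gives another way to build the 0/1 column.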

Hope this helps!!


#4

Thanks @aayushmnit. Let me try.


#5

Hi @indutaneja11,

Sorry, I missed part of your question.
I believe this is from the Titanic Kaggle problem.
I have used the stringr library:

library(stringr)
# Extract the first run of letters; tickets without any letters get NA.
# "[A-Za-z]" is used because the range "A-z" also matches [, \, ], ^, _ and `.
combi$Ticket.cha <- str_extract(combi$Ticket, "[A-Za-z]+")
# Assign 0 where the ticket is numeric (no letters found) and 1 where it is alphanumeric.
combi$TicketCode <- ifelse(is.na(combi$Ticket.cha), 0, 1)

Hope this helps!!