Unique
Remove duplicated data instances.
Inputs
- Data: data table
Outputs
- Data: data table without duplicates
The widget removes duplicate data instances. The user can choose a subset of variables to compare, so two instances count as duplicates when they match on the selected variables, even if they differ in the values of other, ignored variables.
- Select the variables that are considered in comparing data instances.
- Choose which instance to keep from each group of duplicates: the first, the last, the middle, or a random one. Alternatively, keep none, which removes all duplicated instances altogether.
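The two settings above can be illustrated with a minimal sketch in plain Python. This is not the widget's implementation: the function `unique_rows`, its parameters, the option names, and the sample rows are all hypothetical. How the "middle" instance is chosen in an even-sized group is also an assumption (here, index `len(group) // 2`).

```python
import random
from collections import defaultdict


def unique_rows(rows, keys, keep="first", seed=None):
    """Reduce each group of rows that agree on `keys` according to `keep`.

    Hypothetical helper for illustration only, not part of any library.
    """
    rng = random.Random(seed)

    # Group row positions by their values on the selected variables only;
    # all other variables are ignored when comparing.
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        signature = tuple(row[k] for k in keys)
        groups[signature].append(index)

    kept = []
    for indices in groups.values():
        if keep == "first":
            kept.append(indices[0])
        elif keep == "last":
            kept.append(indices[-1])
        elif keep == "middle":
            # Assumption: integer division picks the middle position.
            kept.append(indices[len(indices) // 2])
        elif keep == "random":
            kept.append(rng.choice(indices))
        elif keep == "none":
            # Keep only rows that have no duplicate at all.
            if len(indices) == 1:
                kept.append(indices[0])
        else:
            raise ValueError(f"unknown keep option: {keep!r}")

    # Preserve the original row order in the output.
    return [rows[i] for i in sorted(kept)]


# Made-up rows: three share the same "name" but differ in "value".
rows = [
    {"name": "a", "value": 1},
    {"name": "a", "value": 2},
    {"name": "a", "value": 3},
    {"name": "b", "value": 4},
]

first_kept = unique_rows(rows, ["name"], keep="first")
none_kept = unique_rows(rows, ["name"], keep="none")
all_vars = unique_rows(rows, ["name", "value"], keep="first")
```

Comparing on `name` alone treats the three `a` rows as duplicates, while comparing on both `name` and `value` leaves all four rows in place, since no two rows match on every selected variable.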
Example
The Zoo data set contains two frogs. This workflow keeps only one of them by removing instances that share the same name.
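The same idea can be sketched outside the workflow. The rows below are hypothetical placeholders, not the actual Zoo table, and the `id` field exists only to tell the two frogs apart. The text does not say which of the two frogs the workflow keeps, so this sketch assumes the first one.

```python
def keep_first_by_name(instances):
    """Keep the first instance seen for each name (hypothetical helper)."""
    seen = set()
    result = []
    for instance in instances:
        if instance["name"] not in seen:
            seen.add(instance["name"])
            result.append(instance)
    return result


# Placeholder rows standing in for the data set; values are made up.
animals = [
    {"name": "frog", "id": 1},
    {"name": "frog", "id": 2},
    {"name": "toad", "id": 3},
]

deduplicated = keep_first_by_name(animals)
names = [animal["name"] for animal in deduplicated]
```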