
You can also view Word Frequency query results as a cluster analysis diagram. This type of cluster analysis diagram displays the most frequently occurring words in the selected files or codes.

Cluster by word, coding or attribute value similarity

The files or codes in a cluster analysis diagram can be clustered by word similarity, coding similarity or attribute value similarity.

Word similarity: The words contained in the selected files or codes are compared. Files or codes that have a higher degree of similarity based on the occurrence and frequency of words are shown clustered together. Files or codes that have a lower degree of similarity based on the occurrence and frequency of words are displayed further apart. Stop words are excluded when using this measure.

Coding similarity: The coding of the selected files or codes is compared. Files or codes that have been coded similarly are clustered together on the cluster analysis diagram. Files or codes that have been coded differently are displayed further apart.

Attribute value similarity: The attribute values of the selected files or codes are compared. Files or codes that have similar attribute values are clustered together on the cluster analysis diagram. Files or codes that have different attribute values are displayed further apart on the cluster analysis diagram.

How are cluster analysis diagrams generated?

A similarity metric is a statistical method used to calculate correlation between items. When you create a cluster analysis diagram using the Cluster Analysis Wizard, you choose the similarity metric to apply.

Working with data in other languages

The language used in your data has no impact on the results for cluster analysis by coding or attribute value similarity.

For cluster analysis by word similarity, NVivo will exclude any defined ‘stop words’ from the similarity calculation. To check which stop words apply to your content, you can view the Stop Words list.

When you are working with content in other languages, stop words will improve the outcome of your cluster analysis by excluding similarity based on words which convey less meaning. This will reduce the chance that documents will have a high similarity coefficient based predominantly on these words.

For example, if you are working with content in Turkish, you might like to:
- Set the text content language to ‘Other’.
- Add appropriate Turkish words to the Stop Words list. For examples of what words might be appropriate, take a look at the existing stop words provided in other languages.

See also: Text content language & stop words

Visualize patterns in social media datasets

Cluster analysis enables you to compare similarity of words in social media datasets. For example, you can visualize the similarities and differences across users.
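To make the role of stop words in word similarity concrete, here is a minimal Python sketch. It is not NVivo's implementation: the stop word list, the two sample posts, and the helper names `content_words` and `jaccard` are all hypothetical, and Jaccard's coefficient is used only as one common similarity metric.

```python
import re

# Hypothetical stop words, for illustration only (not NVivo's Stop Words list).
STOP_WORDS = frozenset({"the", "a", "is", "and", "of", "to", "in"})


def content_words(text, stop_words=frozenset()):
    """Return the set of lowercase words in text, minus any stop words."""
    words = re.findall(r"\w+", text.lower())
    return {w for w in words if w not in stop_words}


def jaccard(a, b):
    """Jaccard's coefficient: shared words divided by all distinct words."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Two hypothetical posts that share only low-meaning words.
post_a = "The launch of the new phone is a success"
post_b = "The price of the old car is a problem"

# Similarity with every word counted.
raw = jaccard(content_words(post_a), content_words(post_b))

# Similarity after stop words are excluded.
filtered = jaccard(
    content_words(post_a, STOP_WORDS),
    content_words(post_b, STOP_WORDS),
)

print(f"with stop words:    {raw:.2f}")
print(f"without stop words: {filtered:.2f}")
```

In this sketch the two posts look moderately similar only because they share words such as "the" and "of". Once those are excluded, the similarity drops to zero, which is the effect a suitable stop word list is meant to have.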


The Summary tab displays the similarity index values used to generate the diagram.

1 Items compared: each possible pair of selected items is listed as a row in the table.

2 Similarity Index: displays a value that indicates the degree of similarity for each pair of items based on the similarity metric selected. A high similarity index (maximum = 1) indicates strong similarity, and those items are displayed closer together on the cluster analysis diagram.
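The pairwise layout of the Summary tab can be sketched in Python as follows. This is an illustration under stated assumptions rather than a description of how NVivo computes its values: the `coding` data, the helper names `sorensen` and `summary_rows`, and the choice of the Sørensen coefficient as the metric are all hypothetical.

```python
from itertools import combinations

# Hypothetical coding: each file mapped to the set of codes applied to it.
coding = {
    "Interview 1": {"cost", "access", "trust"},
    "Interview 2": {"cost", "access"},
    "Interview 3": {"weather"},
}


def sorensen(a, b):
    """Sørensen coefficient: twice the shared items over the combined sizes."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def summary_rows(items):
    """One row per possible pair of items, most similar pairs first."""
    rows = [
        (x, y, sorensen(items[x], items[y]))
        for x, y in combinations(sorted(items), 2)
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


rows = summary_rows(coding)
for item_a, item_b, index in rows:
    print(f"{item_a:12} {item_b:12} {index:.2f}")
```

Three items give three rows, one for each possible pair. The pair with the highest index appears first, and it is the pair that would sit closest together on the diagram.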

The Diagram tab displays the visual representation of your data.
