Releases: trinker/qdap
qdap Version 2.2.0
NEWS
Versioning
Releases will be numbered with the following semantic versioning format:
<major>.<minor>.<patch>
And constructed with the following guidelines:
- Breaking backward compatibility bumps the major (and resets the minor
  and patch)
- New additions without breaking backward compatibility bump the minor
  (and reset the patch)
- Bug fixes and misc. changes bump the patch
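The three bumping rules above can be sketched in a few lines of base R. The helper name `bump_version` is hypothetical and is not part of qdap; it only illustrates how each kind of change resets the lower-order fields.

```r
# Hypothetical helper (not part of qdap): bump a "<major>.<minor>.<patch>" string.
bump_version <- function(version, type = c("patch", "minor", "major")) {
  type <- match.arg(type)
  parts <- as.integer(strsplit(version, ".", fixed = TRUE)[[1]])
  stopifnot(length(parts) == 3L, !anyNA(parts))
  names(parts) <- c("major", "minor", "patch")
  if (type == "major") {
    # Breaking change: bump major, reset minor and patch.
    parts[["major"]] <- parts[["major"]] + 1L
    parts[["minor"]] <- 0L
    parts[["patch"]] <- 0L
  } else if (type == "minor") {
    # Backward compatible addition: bump minor, reset patch.
    parts[["minor"]] <- parts[["minor"]] + 1L
    parts[["patch"]] <- 0L
  } else {
    # Bug fix or misc. change: bump patch only.
    parts[["patch"]] <- parts[["patch"]] + 1L
  }
  paste(parts, collapse = ".")
}

bump_version("2.1.1", "minor")  # "2.2.0"
bump_version("1.3.6", "major")  # "2.0.0"
bump_version("2.1.0", "patch")  # "2.1.1"
```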
CHANGES IN qdap VERSION 2.2.0
BUG FIXES
- `bag_o_words` did not make use of the `bag_o_words2` helper function that has
  finer grained control of the output. `...` were ignored but now are respected.
- `fry` threw an error if a group contained < 300 words but had enough text to
  generate 2 text chunks of 100 words each, caught by S. Enrico P. Indiogine.
  The bug has been fixed as these groups are dropped and a warning given.
- `phrase_net` threw an error caused by dplyr's (0.3) approach to subsetting
  columns. Previously a vector was returned, now a `tbl_df` object is returned:
  tidyverse/dplyr#587. This was addressed by using
  explicit `df[[index]]` rather than `df[, index]`.
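The difference between the two subsetting forms in the `phrase_net` fix can be seen without dplyr. The sketch below is an illustration only: it uses a base `data.frame` with `drop = FALSE` to imitate the "table in, table out" behavior described above, and the column names are made up.

```r
df <- data.frame(from = c("a", "b"), to = c("b", "c"), stringsAsFactors = FALSE)

# Base data.frame: single-column `[, j]` drops to a vector by default.
is.character(df[, 1])                  # TRUE

# With drop = FALSE the result stays a data.frame, which imitates what the
# entry above reports for df[, index] on a tbl_df.
is.data.frame(df[, 1, drop = FALSE])   # TRUE

# `[[` extracts the column itself, whatever the container class does for `[`.
identical(df[[1]], c("a", "b"))        # TRUE
```

Code that needs a plain vector is therefore safer with `[[`, since its result does not depend on how the container implements `[`.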
NEW FEATURES
- `chunker` added to break text, optionally by grouping variables, into equal
  chunks. The chunk size can be specified by giving the number of words to be in
  each chunk or the number of chunks.
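The two ways of sizing chunks described above can be illustrated with a small base R sketch. This is not qdap's `chunker`; the helper `chunk_words` and its argument names are hypothetical, and grouping variables are left out.

```r
# Hypothetical helper, not qdap's chunker(): split one text into word chunks,
# sized either by words per chunk or by a target number of chunks.
chunk_words <- function(text, n.words = NULL, n.chunks = NULL) {
  # Exactly one of the two sizing arguments must be supplied.
  stopifnot(xor(is.null(n.words), is.null(n.chunks)))
  words <- strsplit(trimws(text), "\\s+")[[1]]
  if (is.null(n.words)) {
    # Derive words per chunk from the requested chunk count; when the word
    # count does not divide evenly, fewer chunks than requested may result.
    n.words <- ceiling(length(words) / n.chunks)
  }
  groups <- ceiling(seq_along(words) / n.words)
  unname(vapply(split(words, groups), paste, character(1), collapse = " "))
}

x <- "the quick brown fox jumps over the lazy dog again"
chunk_words(x, n.words = 4)   # three chunks; the last holds the 2 leftover words
chunk_words(x, n.chunks = 2)  # two chunks of 5 words each
```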
IMPROVEMENTS
- `all_words` gains `char.keep` and `char2space` arguments to enable retention
  of characters and multi word phrases. These features are passed to
  `freq_terms` as well. Suggested by Stack Overflow's lawyeR
  (http://stackoverflow.com/a/26162401/1000343).
CHANGES
- `rm_url` has been moved into its own canned regex pattern extraction/replacer
  package named `qdapRegex`.
- `name2sex` now uses the `gender` package to predict sex. This makes the
  function slightly slower but much more accurate than previous versions.
  Because of this increased accuracy and dependence on `gender`, the arguments
  `pred.sex`, `fuzzy.match`, and `database` are no longer necessary and have
  been removed.
qdap Version 2.1.1
CHANGES IN qdap VERSION 2.1.1
BUG FIXES
- `syllable_count` returned the sentence (recycled) in the `words` column of the
  output. This behavior has been fixed. See GitHub issue #188 for details.
- `syn` returned antonyms for some words. This was caused by the dictionary:
  `qdapDictionaries::key.syn` contained antonyms and elements that were error
  messages (character). This has been fixed. Reference issue #190. (Jingjing Zou)
- The `pres_debates2012` data set contained three errors in speech attribution.
  This has been corrected and the turn of talk (tot) as well.
- `word_stats` would throw an error if no poly-syllable words existed. This has
  been corrected (reported by Nicolas Turenne).
NEW FEATURES
- `qdap_df` and `%&%` added to mimic some of the functionality of `dplyr`'s
  `tbl_df` and chaining pipe in a more specific, less flexible, `qdap` oriented
  way.
- `Text` added to view and change the `text.var` attribute of a data.frame of
  the class `qdap_df`.
- `cumulative` generic method added to view cumulative scores over time.
- `formality` picks up a `cumulative` method.
- `polarity` picks up a `cumulative` method.
- `end_mark` picks up a class (`end_mark`), `plot` method, and a `cumulative`
  method.
- `syllable_sum`, `polysyllable_sum`, and `combo_syllable_sum` pick up a
  class, `plot` method, and a `cumulative` method.
- `wfm` becomes a generic method currently applied to a `text.var` that is:
  `character`, `factor` (coerced to `character`), or `wfdf`.
- `unbag` added as a complement to `bag_o_words` and friends for undoing string
  splitting. A convenience wrapper for `paste(collapse = " ")`.
- `as.Corpus.TermDocumentMatrix`, `as.Corpus.DocumentTermMatrix`, and
  `as.Corpus.wfm` added to convert a matrix format to a `tm::Corpus`.
- `exclude` becomes a generic method for various classes. Functionality is the
  same but with improved code readability.
- `check_spelling_interactive`, `check_spelling`, `which_misspelled`, and
  `correct` allow the user to identify potentially misspelled words and
  optionally suggest replacements.
- `random_data` & `random_sent` added to generate random sentence data sets and
  vectors.
- `comma_spacer` added to ensure strings with commas contain a space after them.
- `check_text` added to identify potential problems in text.
- `replace_ordinal` added to convert ordinal representations of 1 through 100 to
  strictly ordinal text (e.g., "1st" becomes "first").
- A vignette: Cleaning Text & Debugging was added to assist users with
  cleaning and debugging problems in `qdap`.
- `pronoun_type`, `subject_pronoun_type`, and `object_pronoun_type` added to
  examine usage of subject/object pronouns by grouping variable.
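The `unbag` entry above describes it as a convenience wrapper for `paste(collapse = " ")`. The base R round trip below shows the underlying idea; `strsplit` stands in for `bag_o_words` here as an assumption, so that the example runs without qdap.

```r
# Split a string into words (base R stand-in for bag_o_words).
words <- strsplit("text analysis in r", " ", fixed = TRUE)[[1]]
length(words)                          # 4

# Undo the split: the operation the entry says unbag wraps.
joined <- paste(words, collapse = " ")
joined                                 # "text analysis in r"
```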
MINOR FEATURES
- `dplyr`'s chaining pipe imported for convenience. See
  http://www.rdocumentation.org/packages/magrittr/functions/magrittr for details.
IMPROVEMENTS
- `wfm` gains a speedup through generic classes and `tm` package integration
  (`strip` is no longer used in `wfm`).
- `as.tdm.character` and `as.dtm.character` gain a speed boost with `tm`
  package integration.
- Added message to `as.data.frame.Corpus` for missing end-marks suggesting the
  use of: `sent.split = FALSE`.
- `as.Corpus` family of functions didn't necessarily respect document names and
  sometimes used numeric sequence instead. The introduction of a reader via
  `tm::readTabular` has fixed this.
- `sentSplit` now gives warnings for text that may contain anomalies such as:
  non-ASCII characters, factors, missing punctuation, empty cells, and no
  alphabetic characters found.
- `read.transcript` now gives a warning when reading from a .docx file and the
  separator (`sep`) used is still found in the text as this may indicate the
  data did not split correctly.
- `dispersion_plot` now takes a named list of vectors of terms as the argument to
  `match.terms`. The vectors are combined as a unified theme named with the
  names of the list supplied to `match.terms`.
CHANGES
- `as.data.frame.Corpus`'s default value for `sent.split` is now `FALSE`.
- The `state` column in the `qdap::DATA2` data-set is now character (previously
  factor).
qdap Version 2.1.0
CHANGES IN qdap VERSION 2.1.0
BUG FIXES
- `new_project` did not copy the .Rprofile over into the new project. This has
  been fixed. Reference issue #184.
- `sentiment_frame` coerced words to factor. `stringsAsFactors = FALSE` has
  been added to prevent this.
- `polarity` did not work on > 1 grams due to a bug in `sentiment_frame`
  converting character to factor (thanks for the find @chewth). See GitHub
  issue #185 for details.
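The character-to-factor coercion behind the `sentiment_frame` and `polarity` fixes above is ordinary `data.frame` behavior. The sketch below shows it with made-up words and is not qdap's code; `stringsAsFactors` is set explicitly in both calls because its default differs between R versions.

```r
words <- c("not good", "very bad")

as_factor <- data.frame(words = words, stringsAsFactors = TRUE)
as_char   <- data.frame(words = words, stringsAsFactors = FALSE)

class(as_factor$words)        # "factor"
class(as_char$words)          # "character"

# A factor stores integer level codes, so numeric coercion returns the codes
# rather than anything about the text.
as.integer(as_factor$words)   # 1 2

# Character columns keep behaving like strings.
nchar(as_char$words)          # 8 8
```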
NEW FEATURES
- `unique_by` added to allow the user to find terms unique to individual
  elements of a grouping variable.
- `build_qdap_vignette` replaces the temporary place holder version of the
  Introduction to qdap vignette. This function will replace the (1) HTML,
  (2) source, & (3) R code found in `browseVignettes(package = 'qdap')`.
MINOR FEATURES
- `sub_holder` picks up an `alpha.type` argument that allows the user to specify
  whether alpha or numeric keys should be used.
- `replace_number` picks up a `remove` argument that removes numbers from text.
IMPROVEMENTS
- `qheat` becomes a generic method. This means some of the internal function
  class checking has been moved to individual methods for those classes.
  Additionally, `qheat` now works with logical matrices/data.frames.
- The `tm` package compatibility functions have been renamed in a more R-ish
  way and take the form of generic methods for specific classes. For example,
  `df2tm_corpus` becomes `as.Corpus`. Here is a complete list of changes:
  - `df2tm_corpus` is now `as.Corpus`
  - `tm_corpus2df` is now `as.data.frame`
  - `as.wfm` is now a generic method
  - `tm_corpus2wfm` is now `as.wfm`
  - `tm2qdap` is now `as.wfm`
  - `tdm` is now `as.tdm` or `as.TermDocumentMatrix`
  - `dtm` is now `as.dtm` or `as.DocumentTermMatrix`
CHANGES
- `colsplit2df` and `colpaste2df` no longer convert character columns to factor.
- `df2tm_corpus` is deprecated. It will be removed in a subsequent version of
  `qdap`. Use `as.Corpus` instead.
- `tm_corpus2df` is deprecated. It will be removed in a subsequent version of
  `qdap`. Use `as.data.frame` instead.
- `tm2qdap` is deprecated. It will be removed in a subsequent version of
  `qdap`. Use `as.wfm` instead.
- `tm_corpus2wfm` is deprecated. It will be removed in a subsequent version of
  `qdap`. Use `as.wfm` instead.
- `tdm` is deprecated. It will be removed in a subsequent version of `qdap`.
  Use `as.tdm` or `as.TermDocumentMatrix` instead.
- `dtm` is deprecated. It will be removed in a subsequent version of `qdap`.
  Use `as.dtm` or `as.DocumentTermMatrix` instead.
- The Introduction to qdap .Rmd vignette has been moved to an internal
  directory. The HTML version is not built by default. This saves CRAN space
  and time checking the package source. The file has been replaced with a
  temporary place holder that contains instructions for building the actual
  vignette. The user may also use `build_qdap_vignette` directly.
- `qdap` incorporates the changes from the `tm` package version 0.6:
  http://cran.r-project.org/web/packages/tm/news.html Reference issue #187.
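For readers unfamiliar with how a deprecated function usually keeps working while pointing to its replacement, here is a generic base R sketch. The names `old_fun` and `new_fun` are hypothetical stand-ins (think of a pair such as `df2tm_corpus` and `as.Corpus`), and whether qdap implements its deprecations with `.Deprecated` is an assumption.

```r
# Hypothetical replacement function.
new_fun <- function(x) toupper(x)

# Hypothetical deprecated alias: warn, then forward to the replacement.
old_fun <- function(x) {
  .Deprecated("new_fun")  # base R helper that signals a deprecation warning
  new_fun(x)
}

# Calling the old name still returns a result; the warning is caught here
# and reported as a message so the example runs quietly.
res <- withCallingHandlers(
  old_fun("abc"),
  warning = function(w) {
    message("caught: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
res  # "ABC"
```

Existing code keeps running during the deprecation period, and the warning tells users which name to switch to before the old one is removed.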
qdapTools Version 2.0.0.b
CHANGES IN qdap VERSION 2.0.0
The qdapTools package now houses several former qdap functions. While
qdapTools is a Dependency and all of these functions will be accessible to
the qdap user there is a break in backward compatibility if these functions
are included in code. For this reason this release is a major bump of qdap.
BUG FIXES
- `replace_number` did not replace single digit numbers. Spotted by Ben Bolker.
  This behavior has been fixed and unit testing added for this function. See
  issue #178.
NEW FEATURES
- `sub_holder` added; this function holds the place for particular character
  values, allowing the user to manipulate the vector and then revert the place
  holders back to the original values.
- `Network` method added to make network plots of select qdap objects.
- `qtheme`, `theme_nightheat`, `theme_duskheat`, `theme_norah`, `theme_cafe`,
  `theme_grayscale`, `theme_badkitchen`, and `theme_hipster` added to style
  `Network` plots.
- `polarity` picks up a `Network` method.
- `formality` picks up a `Network` method.
- qdap officially begins utilizing the `testthat` package for unit testing,
  though only a few functions have begun the process; more will be added over
  time.
MINOR FEATURES
IMPROVEMENTS
CHANGES
- The `qdapTools` package now houses the following former `qdap` functions:
  `hash`, `%ha%`, `hash_look`, `hms2sec`, `id`, `lookup`, `%l%`, `%l+%`, `%l*%`,
  `repo2github`, `sec2hms`, `text2color`, `url_dl`, `v_outer`, `list2df`,
  `matrix2df`, `vect2df`, `list_df2df`, `list_vect2df`, `counts2list`,
  `vect2list`, & `mtabulate`. These functions will continue to be available to
  qdap users in interactive mode (`qdapTools` is a Dependency and thus these
  functions are loaded into the workspace by default). This will allow this
  bundle of functions to be used outside of qdap without calling the larger qdap
  package per the request of Kirill Muller (see issue #165).
- As scheduled, the `dissimilarity` function has been removed from the qdap
  package to avoid conflict with the `tm` package. Use the `Dissimilarity`
  function instead.
qdap Version 2.0.0
Initial 2.0.0 bump:
CHANGES IN qdap VERSION 2.0.0
The qdapTools package now houses several former qdap functions. While
qdapTools is a Dependency and all of these functions will be accessible to
the qdap user there is a break in backward compatibility if these functions
are included in code. For this reason this release is a major bump of qdap.
CHANGES
- The `qdapTools` package now houses the following former `qdap` functions:
  `hash`, `%ha%`, `hash_look`, `hms2sec`, `id`, `lookup`, `%l%`, `%l+%`, `%l*%`,
  `repo2github`, `sec2hms`, `text2color`, `url_dl`, `v_outer`. These functions
  will continue to be available to `qdap` users in interactive mode (`qdapTools`
  is a Dependency and thus these functions are loaded into the workspace by
  default). This will allow this bundle of functions to be used outside of
  qdap without calling the larger qdap package per the request of Kirill Muller
  (see issue #165).
- The `dissimilarity` function has been removed from the qdap package to avoid
  conflict with the `tm` package. Use the `Dissimilarity` function instead.
qdap Version 1.3.6
CHANGES IN qdap VERSION 1.3.6
MINOR FEATURES
- `polarity` picks up a `constrain` argument that constrains the polarity values
  to be between -1 and 1.
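One common way to squash an unbounded score into the open interval (-1, 1) is a rescaled logistic curve. The sketch below illustrates that idea only: the function name `constrain_score` is hypothetical, and whether this is the exact transformation applied by `polarity`'s `constrain` argument is an assumption.

```r
# Hypothetical squashing function: rescaled logistic, equal to tanh(x / 2).
constrain_score <- function(x) ((1 - 1 / (1 + exp(x))) * 2) - 1

scores <- c(-3, -0.5, 0, 0.5, 3)
constrained <- constrain_score(scores)

constrain_score(0)            # 0: a neutral score stays neutral
all(abs(constrained) < 1)     # TRUE: every value lies strictly inside (-1, 1)
```

The mapping is monotone, so the ordering of scores is preserved while extreme values are pulled toward the bounds.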
IMPROVEMENTS
- `polarity`'s equation now uses primes on the de-amplifiers before they're
  confined to be >= -1. This avoids confusion in the indicator function that
  took the de-amplifiers variable and returned the same variable.
- `dist_tab`'s frequency columns used a capital F in Freq. This was not
  consistent across all column names and has been changed to lower case.
CHANGES
- `polarity_frame` is deprecated and will be removed in a subsequent release.
  Please use `sentiment_frame` instead.