OK, so I've learned from this paper that paid posting to influence public opinion is carried out systematically by trained teams in China. But my praise ends there.
I don't much trust their machine learning detection approach. But before we get into that, let's describe what they did. The researchers collected about 21,000 comments from a news site, representing 552 users. They then manually selected which of these 552 users they thought were paid posters, without any external confirmation. According to their methodology, they assumed stupid or contradictory comments were by paid posters. The approach from there was to do lots of fancy math on novel comments not screened by the researchers, and classify whether those novel comments were by paid posters or not.
If you haven't guessed it by now, the flaw here is that the researchers decided which commenters they thought were paid based on the stupidity of the comment. It's pretty easy to manually classify e-mail spam, but I'd have a hard time identifying a paid public opinion shill from a comment. Furthermore, if the researchers are using the intelligence of the comment as a marker, my experience with YouTube comments, r/politics, and a few other internet forums leads me to believe that there's no shortage of stupid, contradictory ideas espoused by real unpaid people.
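To make the circularity concrete, here's a toy sketch (not the paper's method or data). Everything in it is made up for illustration: the population, the field names, and especially the assumption that "sounds dumb" is independent of actually being paid. Under that assumption, a model trained on the annotators' labels can agree with them perfectly and still be no better than a coin flip on who is really paid.

```python
from itertools import product

# Hypothetical population: every combination of (truly paid, sounds dumb),
# 25 users each, so the two traits are independent by construction.
# This independence is an assumption for illustration, not a finding.
users = [
    {"paid": paid, "dumb": dumb}
    for paid, dumb in product([True, False], repeat=2)
    for _ in range(25)
]

# The annotators label a user "paid" purely from how dumb the comments sound.
for u in users:
    u["label"] = u["dumb"]


def classifier(user):
    # Stand-in for any model fit on those labels: at best it
    # rediscovers the annotators' own heuristic.
    return user["dumb"]


def accuracy(key):
    # Fraction of users where the classifier matches the given field.
    return sum(classifier(u) == u[key] for u in users) / len(users)


acc_vs_labels = accuracy("label")  # agreement with the hand-made labels
acc_vs_truth = accuracy("paid")    # agreement with who is actually paid

print(f"vs. annotator labels: {acc_vs_labels:.2f}")
print(f"vs. ground truth:     {acc_vs_truth:.2f}")
```

How badly this bites in practice depends on how correlated "dumb" and "paid" really are, which is exactly the thing nobody verified.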
Sorry about following your OT here, but it's a very interesting thought: that the Great Firewall might lead you to overestimate the uniqueness of your own work if prior art is censored from your view.
This reminds me of a video I saw a while back where they showed Chinese students the iconic "man in front of a line of tanks" picture and they failed to recognize it. I wonder whether irony will eat itself in the future - maybe some "capitalist" country will decide to use tanks on its people and they will try to stop it with a human shield. Cue "look at the failure of capitalism - using tanks on their own people who bravely stand up to it!" comments from China's statesmen.
Shielding your citizens from material that encourages dissenting thought is a cute plan, but you may end up making them look like fools on the international stage.
In the USA, it is now law that bloggers have to mention any gifts of products, books, licenses, etc. that might influence blog posts.
I think this is good!
I am an author, and I get comped a lot of books, and some software products. It just feels right to say something like, for example, "thanks to publisher Z for sending me a review copy of book B" when talking about book B. If I get comped something that I don't like, then I won't talk about it.
Paid posters on Reddit, etc., are more insidious because you just have to guess if they might be paid by a company or government to push desired hype.
not actually a "law" -- more of an F.T.C. regulation
"The Guides are administrative interpretations of the law intended to help advertisers comply with the Federal Trade Commission Act; they are not binding law themselves." -- http://www.ftc.gov/opa/2009/10/endortest.shtm
§ 255.5 Disclosure of material connections.
When there exists a connection between the endorser and
the seller of the advertised product that might materially
affect the weight or credibility of the endorsement (i.e.,
the connection is not reasonably expected by the audience),
such connection must be fully disclosed.
Relevant example:
Example 7: A college student who has earned a reputation as
a video game expert maintains a personal weblog or “blog”
where he posts entries about his gaming experiences. Readers
of his blog frequently seek his opinions about video game
hardware and software. As it has done in the past, the
manufacturer of a newly released video game system sends the
student a free copy of the system and asks him to write about
it on his blog. He tests the new gaming system and writes a
favorable review. Because his review is disseminated via a
form of consumer-generated media in which his relationship to
the advertiser is not inherently obvious, readers are
unlikely to know that he has received the video game system
free of charge in exchange for his review of the product, and
given the value of the video game system, this fact likely
would materially affect the credibility they attach to his
endorsement. Accordingly, the blogger should clearly and
conspicuously disclose that he received the gaming system
free of charge. The manufacturer should advise him at the
time it provides the gaming system that this connection
should be disclosed, and it should have procedures in place
to try to monitor his postings for compliance.
From the geographical distribution of users vs. paid posters on pg. 7, two provinces, Sichuan and especially Shandong, stand out for their lower ratio of paid posters. Any idea why that is?
I suspect quite a bit of astroturfing is going on in app store reviews. My national iOS app store has a fairly small volume of ratings/reviews, usually single or double digits, only hitting hundreds for a very few apps. I've spotted two or three weird scattershots of desultory, ill-fitting, generic-sounding reviews. It even looked like the effort needed was overestimated out of ignorance ...