python - Why are some tweets in the Search API and not in the Streaming API, and vice versa?


I have a script that stores incoming tweets containing a phrase (e.g. "python") in database table "a" using the Twitter Streaming API. Later, another script searches for the same phrase using the Twitter Search API and stores the results in table "b". My question is why there are tweets in "a" that are not in "b", and vice versa.

I can think of one reason to have tweets in "b" and not in "a":

"a" contains only tweets posted after the Streaming API script started, while the Search API returns results from the last week. If the Streaming API script has been running for more than a week, there should be no tweet in "b" that is not in "a".

I know of two reasons to have tweets in "a" and not in "b":

  1. The Search API returns results only from the last week, while the Streaming API returns everything.
  2. The Search API returns only a portion of the results; its focus is not on completeness.
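
To make the comparison concrete, here is a minimal sketch of how the two tables could be diffed. It assumes each table has been loaded into a dict mapping tweet id to a timezone-aware `created_at`; the function name `compare_tables`, the bucket names, and the 7-day `SEARCH_WINDOW` are illustrative assumptions, not part of either API.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "last week" in the question means a 7-day search window.
SEARCH_WINDOW = timedelta(days=7)

def compare_tables(stream_rows, search_rows, stream_started, now):
    """Bucket the differences between table "a" (stream) and table "b" (search).

    Both row arguments are dicts of tweet id -> created_at datetime.
    """
    only_a = set(stream_rows) - set(search_rows)
    only_b = set(search_rows) - set(stream_rows)
    cutoff = now - SEARCH_WINDOW
    return {
        # in "a" only, and older than the search window
        "a_not_b_too_old": {i for i in only_a if stream_rows[i] < cutoff},
        # in "a" only, although recent enough for search to have returned it
        "a_not_b_unexplained": {i for i in only_a if stream_rows[i] >= cutoff},
        # in "b" only, posted before the stream script was started
        "b_not_a_before_start": {i for i in only_b if search_rows[i] < stream_started},
        # in "b" only, although the stream script was already running
        "b_not_a_unexplained": {i for i in only_b if search_rows[i] >= stream_started},
    }
```

The two "unexplained" buckets are the tweets that the time-window reasoning alone does not account for.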

I'd like to make sure whether I got this correct or not.

For "b" not in "a", you are correct. A big indication of this is in the Search API link you included:

it allows queries against indices of recent or popular tweets...

For "a" not in "b", you're correct, with minor mistakes.

  1. The Streaming API does not return everything; it returns 1% of total tweets. The 1% filter is applied internally at Twitter, and there has been no indication of how it is done. There was an announcement not long ago about fixing the 1% to make it a true 1%, but I can't seem to find the link where I read it.
  2. With the Streaming API you're also impaired (more commonly) by:
    • public stream limit notices (reaching the 1%)
    • stall warnings (warning)

There are a few others depending on your use: https://dev.twitter.com/streaming/overview/messages-types
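
As a rough sketch of how a streaming script might notice these conditions, the function below classifies one raw line read from the stream. It assumes the message shapes described on the page linked above (a top-level `limit` object with a `track` count, a top-level `warning` object with a `code`, and blank keep-alive lines); the function name `classify_stream_message` is hypothetical, and the field names should be checked against the current documentation.

```python
import json

def classify_stream_message(raw_line):
    """Return a (kind, detail) tuple for one line read from the stream."""
    line = raw_line.strip()
    if not line:
        # Assumption: blank lines are keep-alives, not data.
        return ("keep_alive", None)
    msg = json.loads(line)
    if "limit" in msg:
        # Assumption: "track" counts matching tweets that were not delivered.
        return ("limit", msg["limit"].get("track"))
    if "warning" in msg:
        # Stall warning: the client is reading too slowly.
        return ("warning", msg["warning"].get("code"))
    if "delete" in msg:
        return ("delete", msg["delete"])
    if "id" in msg and "text" in msg:
        return ("tweet", msg["id"])
    return ("other", msg)
```

Logging every `limit` and `warning` message next to the stored tweets gives a record of when table "a" is likely to be missing tweets.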

