Identifying Scammy Cryptos and ICOs – Approach

“A wise man proportions his belief to the evidence.” – David Hume, Scottish philosopher

As crypto researchers, we believe the community cannot go mainstream without enough controls to prevent ordinary investors from getting duped by scammy ICOs and coins. Our goal is to detect scams and warn investors, using machine learning and crowdsourcing techniques.

Our Approach
Comparing cryptos is not easy, because we don’t have enough data points to classify a crypto or an ICO as a scam or not. We follow a similarity-based approach, in which we quantify how good a crypto is with respect to a reference. This way, apart from a score computed using weights, we get a similarity score to the reference in a vector space model. A standard way of doing this is cosine similarity. Our approach gives two outputs: an overall score and a similarity score with respect to the reference. Both lie between 0 and 1, where 1 is the maximum. Our assumption is that these two values together give enough evidence of how good a crypto is. We collectively chose Ethereum as the reference point, as it has everything we are looking for across the dimensions (utility, team, technology, etc.).
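To make the similarity score concrete, here is a minimal sketch of cosine similarity between a candidate and the reference, in plain Python. The function name, the candidate's scores, and the handling of an all-zero vector are assumptions made for illustration; they are not part of the model described in this post.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of features")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption: an all-zero vector is treated as having no similarity.
        return 0.0
    return dot / (norm_a * norm_b)

# Reference vector with every feature at the maximum score.
reference = [1.0] * 7
# Hypothetical candidate scores, invented purely for illustration.
candidate = [0.5, 0.25, 0.0, 0.5, 0.25, 0.5, 1.0]

print(cosine_similarity(reference, reference))
print(cosine_similarity(candidate, reference))
```

One thing to keep in mind: cosine similarity measures direction only, not magnitude. A coin that scores 0.1 on every feature still has a similarity of 1 to an all-ones reference, which is a good reason to read the similarity score together with the overall score rather than on its own.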

Features
We are going to use the features listed below as dimensions to project a crypto into a vector space and compare similarity:

  • Utility – Usefulness of crypto or ICO
  • Team – Team behind the crypto
  • Partnership – How well connected the team and company are to the outside industry
  • Technology: Implementation – Code check-ins, quality of checked-in code, testnets, nodes, etc.
  • Technology: Whitepaper – Quality of the whitepaper (detect plagiarism, vapourware promises, etc.)
  • Community – Size of the community across BCT, Reddit, Twitter, Telegram and more, plus the overall community sentiment as measured by a sentiment analysis model
  • Price Data Time Series – Signs of pump and dump and other manipulation techniques

Final Model
    T = Team
    U = Utility
    P = Partnership
    W = Technology – Whitepaper
    I = Technology – Implementation (code check-ins, etc.)
    C = Community
    S = Price Data Time Series – Pump and dump, and other manipulation

    Weights – These sum to 1, so the maximum final score is 1. We’ll re-tune them based on feedback once we have enough training data.
    [w1,w2,w3,w4,w5,w6,w7] = [0.2,0.2,0.2,0.1,0.1,0.1,0.1]


    Ethereum (reference) –
    [T,U,P,W,I,C,S] = [1,1,1,1,1,1,1] => This vector would be community generated from evidence submitted on our subreddit, /r/shamcoin/. We will use that evidence to come up with the final vector.

    Final Score = T*w1 + U*w2 + P*w3 + W*w4 + I*w5 + C*w6 + S*w7
    For Ethereum:
    The final score would be 1, since all the feature scores are 1 and the weights sum to 1.
    The similarity score would be 1 as well, because the vectors are exactly the same: the angle between them is 0, and cos(0) is 1.
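
The final score formula above can be sketched in a few lines of Python. The feature order and weights are the ones listed in this post; the function name, the range check on scores, and the example coin are assumptions added for illustration.

```python
import math

# Feature order and weights as listed in this post: T, U, P, W, I, C, S.
FEATURES = ["T", "U", "P", "W", "I", "C", "S"]
WEIGHTS = [0.2, 0.2, 0.2, 0.1, 0.1, 0.1, 0.1]

def final_score(scores, weights=WEIGHTS):
    """Weighted sum of per-feature scores, each expected to be in [0, 1]."""
    if len(scores) != len(weights):
        raise ValueError("expected one score per weight")
    if any(not 0.0 <= s <= 1.0 for s in scores):
        # Assumption: feature scores are normalised to [0, 1] beforehand.
        raise ValueError("feature scores must lie between 0 and 1")
    return sum(s * w for s, w in zip(scores, weights))

# Reference vector: every feature at the maximum score.
ethereum = [1, 1, 1, 1, 1, 1, 1]
# Hypothetical coin: decent team, utility and partnerships, nothing else.
example_coin = [0.5, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0]

print(final_score(ethereum))
print(final_score(example_coin))
```

Because the weights sum to 1, the score stays between 0 and 1 as long as each feature score does. The hypothetical coin comes out at about 0.3, since only the three features weighted 0.2 contribute.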

    Next Step – Rank every crypto and ICO with this model!
