• Bronzebeard@lemmy.zip · 2 days ago

    If you can create a tool that accurately identifies what is AI-generated, then you’ve just created a tool that can be used to train an AI to trick it.

    This is essentially how many types of models are already trained (rough sketch below).
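
    Roughly, in the GAN-style adversarial setup: a detector (discriminator) learns to flag generated samples, and the generator is trained directly against the detector’s output, so the better the detector, the better the generator gets at fooling it. A toy sketch (the networks, data, and numbers below are placeholders, not a real text detector):

```python
import torch
import torch.nn as nn

# Toy networks: 2-D "samples" stand in for real content.
generator = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
detector  = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(detector.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(32, 2) + 3.0        # stand-in for "human-made" samples
    fake = generator(torch.randn(32, 8))   # "AI-generated" samples

    # 1) Train the detector to tell real from fake.
    d_opt.zero_grad()
    d_loss = loss_fn(detector(real), torch.ones(32, 1)) + \
             loss_fn(detector(fake.detach()), torch.zeros(32, 1))
    d_loss.backward()
    d_opt.step()

    # 2) Train the generator so the detector calls its output "real" --
    #    the detector's own signal is what teaches the generator to evade it.
    g_opt.zero_grad()
    g_loss = loss_fn(detector(fake), torch.ones(32, 1))
    g_loss.backward()
    g_opt.step()
```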

    • Lvxferre [he/him]@mander.xyz · 2 days ago

      The core argument of the text isn’t even an arms race, like yours. It’s basically “if you can’t get it 100% accurate then it’s pointless lol lmao”. That’s simply a nirvana fallacy; it’s on the same level of idiocy as saying “unless you can live forever, you might as well die as a baby”.

      With that out of the way, addressing your argument separately: the system doesn’t need to be 100% accurate, or perfectly future-proof, to still be useful. It’s fine if it produces some false positives and negatives, or if it needs further improvement to account for newer models evading detection.

      Accuracy requirements depend a lot on the purpose; see the back-of-the-envelope sketch after the list. For example:

      • If you’re using the system to detect AI “writers” and automatically permaban them, then you need damn high accuracy. Probably 99.9%, or perhaps even higher.
      • If you’re using it to detect AI “writers” and then manually reviewing their submissions before banning them, then the accuracy can be lower, like 90%.
      • If you aren’t banning anyone, just triaging what you will / won’t read, then 75% accuracy is probably enough.
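
      Back-of-the-envelope, with made-up numbers for post volume and loosely reading “1 − accuracy” as the rate at which human posts get wrongly flagged, the same thresholds look like this:

```python
# Made-up numbers: 10,000 genuinely human submissions, and "1 - accuracy"
# loosely read as the rate at which human posts get wrongly flagged as AI.
human_posts = 10_000
false_flag_rate = {
    "auto-permaban":   0.001,  # 99.9% accuracy
    "manual review":   0.10,   # 90%: a human catches the mistakes before any ban
    "personal triage": 0.25,   # 75%: worst case, you skip a post you'd have liked
}

for purpose, rate in false_flag_rate.items():
    wrongly_flagged = human_posts * rate
    print(f"{purpose}: ~{wrongly_flagged:.0f} of {human_posts} human posts wrongly flagged")
```

      Even ten wrong permabans per ten thousand posts is a lot of angry humans, which is why the fully automated case needs the strictest accuracy, while the triage case can tolerate being wrong a quarter of the time.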

      I’m also unsure it’s as simple as using the detection tool to “train” the generative tool. I often notice LLMs spouting nonsense that the same model can call out as nonsense afterwards; this hints that generating content with certain attributes is more complex than detecting that a piece of content lacks them.