brookst an hour ago

MacRumors is such a trash site. This article is MacRumors adding approximately nothing to this TechCrunch article: https://techcrunch.com/2024/09/30/meta-wont-say-whether-it-t...

Note that the tiny bit MacRumors added is converting TechCrunch’s accurate “Meta declined to say” into a claim of probability.

TechCrunch has updated the story [1] with concrete answers: Meta trains on images when the user asks the AI to recognize them, but does not train passively in the background.

1. https://techcrunch.com/2024/10/02/meta-confirms-it-may-train...

simonw an hour ago

> TechCrunch doesn't come out and say it, but if the answer is not a clear and definitive "no," it's likely that Meta does indeed plan to use images captured by the Meta Glasses to train Meta AI. If that wasn't the case, it doesn't seem like there would be a reason for Meta to be ambiguous about answering, especially with all of the public commentary on the methods and data that companies use for training.

I have a slightly different interpretation of this: I think Meta want to keep their options open.

I think that’s true of many of these “will they train on your data?” stories.

People tend to overestimate the value of their data for training. AI labs are constantly looking for new sources of high-quality data - but quality really matters to them. The random junk people feed into the models is right at the bottom of that quality list.

But what happens if Meta say “we will never train on this data”… and then next week a researcher comes up with some new training technique that makes that data 10x more valuable than when they made that decision not to use it?

It’s safer for them not to make concrete promises they can’t back out of later, in case the data turns out to be more valuable than they initially expected.

  • visarga 17 minutes ago

    > AI labs are constantly looking for new sources of high-quality data

    OpenAI has 200M users and solves over 1B tasks per month interactively. That amounts to roughly 1-2 trillion mixed human/AI tokens per month. The fact is that every user has their own unique life experience and a reservoir of tacit knowledge they didn't communicate or write down anywhere else. The LLM can elicit that tacit knowledge, which would otherwise be lost; it can crawl our minds for ideas and problem-solving choices.
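
    A quick back-of-envelope on those numbers (the ~1k-2k tokens per task is an assumed figure, not sourced):

      tasks_per_month = 1_000_000_000            # "over 1B tasks per month"
      tok_low, tok_high = 1_000, 2_000           # assumed tokens per task
      print(tasks_per_month * tok_low)           # 1_000_000_000_000 -> ~1T tokens/month
      print(tasks_per_month * tok_high)          # 2_000_000_000_000 -> ~2T tokens/month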

    LLMs are in a situation of indirect agency. When they propose a solution, the human usually takes it out and implements it in the real world, then comes back for more help and reports the outcomes. Across many sessions it becomes possible to check which AI ideas worked out and which were bad. This is a huge resource: it collects experience from every user. The LLM becomes an experience flywheel; people are attracted to the best models, so those models will get the lion's share of this experience.

    And yes, you can do it with privacy in mind. You can train just a preference model instead of doing supervised training on chat logs: a model that picks the better answer out of a lineup, as sketched below. This way PII and user specifics don't leak.
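
    A minimal sketch of that idea (Python/PyTorch; the names are illustrative, not any lab's actual code): a pairwise Bradley-Terry preference loss, where the model learns only a scalar "which answer is better" signal and the chat text itself is never a training target:

      import torch
      import torch.nn.functional as F

      def preference_loss(score_chosen: torch.Tensor,
                          score_rejected: torch.Tensor) -> torch.Tensor:
          # P(chosen beats rejected) = sigmoid(score_chosen - score_rejected);
          # training minimizes the negative log of that probability.
          return -F.logsigmoid(score_chosen - score_rejected).mean()

      # Usage: the scores come from a reward model run on two candidate answers.
      loss = preference_loss(torch.tensor([1.7]), torch.tensor([0.3]))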

  • meroes an hour ago

    I don’t think it’s about junk vs. not junk here. It’s about whether this data is novel enough compared to the billions of images and hours of video they already possess. As others have already said, these captures are fundamentally different from pointing and shooting (downtime in the home, moments where a traditional photo might be inappropriate), so the data is novel.

  • fraboniface an hour ago

    Video streams from everyday life sound like extremely high-quality data for training AI. Videos available on the internet are a very biased sample compared to that.

  • nbardy an hour ago

    Ray-Bans won’t produce random junk. They will provide incredible end-task data showing how humans perform tasks in arbitrary homes.

  • freedomben an hour ago

    I agree; we shouldn't be jumping to conclusions, especially with flawed logic/reasoning like this:

    > it doesn't seem like there would be a reason for Meta to be ambiguous about answering, especially with all of the public commentary on the methods and data that companies use for training.

    There is another possible reason and it's not that hard to think of: they want to keep their options open. It's also possible that they don't want to play the game of "anything not explicitly denied is an admission."

    My guess is that they are currently training on it (or will be in the future), but I don't think we should take these statements as evidence of that.

    In general I think big tech is atrocious with private information, and if the average person knew the depth of the data these companies hold, they would not stand for it. Meta certainly doesn't have a great track record on the data front, so there's no reason to think they'd be different. Unfortunately, the average person thinks people like me are paranoid and/or crazy when we try to tell them about it. At best they just feel powerless, shrug, and keep using the product anyway.

  • unshavedyak an hour ago

    I have a feeling that random junk will become valuable again once AI can augment it at scale. I’m sure that already happens; I’m just assuming it’ll get more extreme.

Shank an hour ago

The glasses actually have two modes where you can take pictures. First, you can take pictures with the capture button or by just asking for a picture. Second, you can ask Meta AI to look at something and tell you about what it sees. You can say “Hey Meta, look at this and tell me what it is,” and it goes off to the AI cloud to get an answer.

I would assume that since you can enable AI and opt in to “improve the AI,” that data is fair game for training.

But when you take photos using the capture button or by saying “Hey Meta, take a photo,” the photos don’t even use cloud storage by default. You have to specifically turn on Meta’s temporary cloud storage feature to sync data to your iPhone while charging, a workaround for iOS rules. If you don’t, those photos are local-only.

There are instances where questions can be answered just by using the product and reading the legal documents. I think this is clearly lackluster research from TechCrunch, carried over to MacRumors.

rchaud an hour ago

Proof positive that paying for a product doesn't stop the vendor from collecting your data anyway to monetize something else.

The real question is: who's paying $500 for the privilege of being a willing mule for Zuck's surveillance-empire-building dreams?

  • dylan604 an hour ago

    At this point, if it has the Meta name on it, you'd be pretty safe in assuming that collecting data is part of its purpose.

    • timeon 23 minutes ago

      I still remember a time when spyware was considered to be on the same level as a virus.

      Now it is just: 'we care about your privacy, allow us to sell it'.

mgh2 an hour ago

Not surprised. The question is: will the average buyer demographic care?

As with their social media products targeting less savvy consumers, will this product cross into the mainstream after the early tech-savvy adopters (innovators)?

Mark is betting they won't, as with other tech that uses their data (e.g., Android).

  • ilrwbwrkhv an hour ago

    No. It's the boiling frog situation. The world will get shittier and shittier and everyone will complain but no one will stop it.

  • karaterobot an hour ago

    People will complain when the data is inevitably misused (I'm shocked! Shocked to find irresponsible data collection going on at Facebook!) but they won't do anything to prevent or avoid it in advance, if doing so gets in the way of them buying something that is sufficiently hyped.

zombiwoof 17 minutes ago

This crosses a red line. So Meta can use images of passers-by like me to train their models, when I want nothing to do with them?

Simon_ORourke an hour ago

Of course they are; anything for a quick buck. It must be hinted at somewhere in the terms and conditions.

  • rchaud an hour ago

    One yearns for simpler times when the buck could be made from the sale of $350-$500 glasses alone.

mattlutze an hour ago

Unique datasets are the moat for AI businesses. The data from these glasses is quite novel in its perspective, context, and contiguousness, among other attributes.

Compared to mobile phones, Meta is making a better go at, and seems to be the leader on, the bet that fundamentally ML/AI-driven glasses will be the next default modality for UX on the Internet.

cs702 an hour ago

"Probably?"

In what universe would Meta not use the data it collects to improve its AI models?

Most consumers don't care about privacy implications in the abstract, so they won't even think of asking Meta to stop.

Most tech people working with AI want Meta to continue to improve its open-source, open-weight Llama models, so they will be reluctant to ask Meta to stop.

rkahga an hour ago

Given that Facebook has been creating shadow profiles of non-users for a long time, it is not far-fetched to assume that they will track the physical contacts of the Ray-Ban spy-device wearer and record every interaction.

The current status regarding electronic devices is:

- If you have a pager or walkie-talkie, assume that it might blow up.

- If you have a smartphone, assume that it records your conversations.

- If another person has these Ray-Bans, run, don't walk.

More and more people know this. Even non-technical people are beginning to wake up.

BadHumans an hour ago

If you didn't expect this, then I don't know what to say at this point. Meta must have hired a new PR agency, because the amount of leeway and charitable interpretation I've seen given to this company recently is absurd given their track record.

theptip 32 minutes ago

ChatGPT does this too by default, right?

DesiLurker 13 minutes ago

Duh! There is almost zero chance they are not. This is probably how they are selling such a big expense line item internally.

sub7 17 minutes ago

Despite all the money they've recently spent trying to rehab the founder's image, Meta's core DNA is built around invasion of privacy, dark silent opt-in patterns, abuse of user data, etc.

I judge anyone who made their money there, because they have made the world a much shittier place just by existing, like some low-quality 21st-century nicotine dealership.

These glasses will 100% be used to ID, track, and ad-bucket-tag people without their consent. I'll be slapping them off the face of anyone who looks in my direction, and you should too.

isodev an hour ago

Is this Apple PR trying to stop people from buying Meta VR/MR hardware?

I mean, it’s Facebook, we get it. But they make WhatsApp, they make the affordable and actually functional Quest, and now these glasses… I’m not going to get triggered by opinion posts just because the Vision Pro was a flop.

This would sound exactly the same if we said Apple were training their AI on everyone’s Photos and on content from notifications (because of Apple Intelligence).

nonrandomstring an hour ago

TFA has some entertaining descriptions of the user base: low-IQ knuckle-draggers who need help choosing their clothes and can't remember where they parked their car. I'd prefer a more honest take on how most "glassholes" use this tech, namely finding the names and addresses of "hot" passers-by they've recorded for their wank-bank. Wait till all this gets leaked from Meta (which is inevitable) and we see where the average user's attention really dwells.