Message Boards » Chit Chat » Hey math nerds!
ThatGoodLock
All American
5697 Posts

If I have a document with 7,500 different words and I want to know how many possible "phrases" I can make out of that document, to include single words so at least 7,500 base phrases...how would I go about doing that?

There's no set number of words for each phrase, it could be as short as 1 word (1 base phrase) or as long as 7,500 words long (also 1 phrase)

you can't repeat the same word when counting

you can make a phrase out of words that don't run next to each other (word 1, word 3, word 1114 = 1 phrase) but again, you can't repeat so there's no other combination of those exact 3 words that would amount to another phrase

god i hate math, this isn't for anything school-related, just trying to wrap my head around something...

11/22/2012 6:37:41 PM

bbehe
Burn it all down.
18369 Posts

7500!

11/22/2012 6:39:42 PM

ThatGoodLock
All American
5697 Posts

ok now calculate using more than just the number of base phrases...

11/22/2012 6:41:33 PM

Noen
All American
31346 Posts

Should be a factorial minus the difference of unique words from total words.

So (7500 !) - (7500 - unique words)

11/22/2012 6:44:41 PM

bbehe
Burn it all down.
18369 Posts

nm

[Edited on November 22, 2012 at 6:46 PM. Reason : a]

11/22/2012 6:45:46 PM

Krallum
56A0D3
15294 Posts

Lets do something large

I'm Krallum and I approved this message.

11/22/2012 6:46:51 PM

bbehe
Burn it all down.
18369 Posts

summation of (7500!)/(7500-n)! for n=0..7500

11/22/2012 6:47:36 PM

GrayFox33
TX R. Snake
10566 Posts

42

[Edited on November 22, 2012 at 6:55 PM. Reason : Crallum]

11/22/2012 6:53:15 PM

bbehe
Burn it all down.
18369 Posts

3.3249187468*10^25808

11/22/2012 7:02:11 PM

ThatGoodLock
All American
5697 Posts

how is it not
5.211710597023 E+25850

11/22/2012 7:03:05 PM

ThatGoodLock
All American
5697 Posts

whoops, i calculated 7,511!

11/22/2012 7:03:45 PM

bbehe
Burn it all down.
18369 Posts

How did you get that?

11/22/2012 7:03:46 PM

GingaNinja
All American
7177 Posts

(7500 P 1)+ (7500 P 2) + (7500 P 3) + ..... + (7500 P 7500) ??

11/22/2012 7:25:08 PM

bbehe
Burn it all down.
18369 Posts

^ Which is what summation of (7500!)/(7500-n)! for n=0..7500 means.

11/22/2012 7:44:43 PM
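
As a sanity check on the two notations above, here is a minimal Python sketch (not from the thread; the function names are made up for illustration). It counts ordered phrases of distinct word positions by brute-force enumeration and by summing (n P k) for k = 1..n. The summation that starts at n=0 also counts the empty phrase, so the two forms differ by exactly 1.

```python
from itertools import permutations
from math import factorial, perm


def count_by_enumeration(n: int) -> int:
    """Count every ordered arrangement of 1..n distinct positions out of n."""
    return sum(1 for k in range(1, n + 1) for _ in permutations(range(n), k))


def count_by_formula(n: int) -> int:
    """(n P 1) + (n P 2) + ... + (n P n)."""
    return sum(perm(n, k) for k in range(1, n + 1))


def summation_from_zero(n: int) -> int:
    """Sum of n!/(n-k)! for k = 0..n, which also counts the empty phrase."""
    return sum(factorial(n) // factorial(n - k) for k in range(n + 1))


# Brute force is only practical for tiny n, but it confirms the formula.
for n in range(1, 7):
    assert count_by_enumeration(n) == count_by_formula(n)
    assert summation_from_zero(n) == count_by_formula(n) + 1

print(count_by_formula(3))  # 3 + 6 + 6 = 15
```

For 7,500 words the extra 1 is invisible next to a number with more than 25,000 digits, so both forms give the same leading digits.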

lewisje
All American
9196 Posts

tree fiddy

thou

11/22/2012 8:58:30 PM

ndmetcal
All American
9012 Posts

11/22/2012 9:01:26 PM

qntmfred
retired
40371 Posts

Hey!

11/22/2012 11:12:28 PM

GeniuSxBoY
Suspended
16786 Posts

Quote :
"you can't repeat the same word when counting
"


If "is" is written 20 times in the document, are we only counting it once or is each "is" an individual word like is1, is2, is3, ... is20

11/22/2012 11:30:58 PM

Dentaldamn
All American
9974 Posts

50/50

11/22/2012 11:42:45 PM

ThatGoodLock
All American
5697 Posts

^^ the second thing, word 40 can only be used once but if it's "the" it can appear several times

11/23/2012 12:12:06 AM

modlin
All American
2642 Posts

Do the phrases have to make sense?

11/23/2012 7:38:29 AM

NCSUStinger
Duh, Winning
62329 Posts

smath needs to get in on this

11/23/2012 8:00:12 AM

ThatGoodLock
All American
5697 Posts

^^ i have a specific document in mind, but for purposes of this you can imagine that for whatever reason you'd want to combine random words into phrases

11/23/2012 12:01:40 PM

GrayFox33
TX R. Snake
10566 Posts

What is the prize for this?

11/23/2012 12:03:41 PM

GeniuSxBoY
Suspended
16786 Posts

I think smath just teaches elementary math.

11/23/2012 12:04:18 PM

JeffreyBSG
All American
10165 Posts

Very close to 7500!*e, which is about 3.324918748*10^25808, according to Maple.

First consider: how many ways can you make a phrase out of exactly k distinct words? Well, there are

binomial(7500,k)=7500!/((7500-k)!k!) ways of choosing k words. And there are k! ways of arranging each subcollection of k words, so there are

7500!/(7500-k)! phrases you can make out of exactly k words. The number of words your phrase contains can be any k from k=0..7500, so your answer is

sum( 7500!/(7500-k)!, k=0..7500) = 7500! * sum(1/k!, k=0..7500)

(the second form comes from re-indexing the sum with j = 7500-k).

Since sum(1/k!, k=0..infinity)=e, summing up to 7500 will get us pretty darned close to e. So our answer is very close to 7500! * e (although the vastness of the 7500! may make the difference between sum and limit significant; I don't know, offhand.)

oh wait, this is exactly what bbehe said


[Edited on November 23, 2012 at 4:28 PM. Reason : gtjeoi]

11/23/2012 4:24:13 PM
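
On the "I don't know, offhand" point above: the leftover tail n! * (1/(n+1)! + 1/(n+2)! + ...) is smaller than 1/n, so for n >= 1 the finite sum is exactly e * n! rounded down. The Python sketch below (not from the thread; `ordered_phrase_total` and `leading_digits` are hypothetical helper names) computes the sum with exact integers and extracts its leading digits and exponent. It assumes a phrase is an ordered choice of distinct word positions, as in the derivation above, and it includes the k = 0 empty phrase.

```python
import math


def ordered_phrase_total(n: int) -> int:
    """Exact sum of n!/(n-k)! for k = 0..n (the k = 0 term is the empty phrase)."""
    total = 1  # k = 0
    term = 1
    for i in range(n, 0, -1):
        term *= i  # term is now n * (n-1) * ... * i, i.e. n!/(i-1)!
        total += term
    return total


def leading_digits(x: int, digits: int = 8) -> tuple[int, int]:
    """Return (first `digits` decimal digits of x, base-10 exponent of x).

    Avoids str(x), which recent Python versions limit for very large ints.
    """
    exp = int(math.log10(x))
    # log10 is a float estimate; nudge it if it landed on the wrong side.
    if 10 ** exp > x:
        exp -= 1
    elif 10 ** (exp + 1) <= x:
        exp += 1
    if exp >= digits - 1:
        return x // 10 ** (exp - digits + 1), exp
    return x * 10 ** (digits - 1 - exp), exp


# Small-n check of the identity sum == floor(e * n!), using floats.
for n in range(1, 11):
    assert ordered_phrase_total(n) == math.floor(math.e * math.factorial(n))

lead, exp = leading_digits(ordered_phrase_total(7500))
print(f"{lead / 10**7:.7f} x 10^{exp}")
```

Subtracting 1 to drop the empty phrase does not change any of the leading digits at this size.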

GrayFox33
TX R. Snake
10566 Posts

Well, he does wear glasses.

11/23/2012 4:59:26 PM

Arab13
Art Vandelay
45166 Posts

If you want the phrases to make sense then that dramatically reduces the number of combinations.

11/23/2012 7:20:51 PM

bbehe
Burn it all down.
18369 Posts

^^ no pocket protector though.

11/23/2012 7:57:46 PM

bronco
All American
3942 Posts

11/23/2012 8:22:16 PM

moron
All American
33720 Posts

This was easy...

Calculate the closed form.

11/23/2012 9:07:27 PM

modlin
All American
2642 Posts

If words 29 and 378 are both "ass"
And words 54 and 5012 are both "deep"

Can we make two valid phrases that are identical? i.e., "ass deep" and "ass deep"?

11/25/2012 11:58:00 AM

paerabol
All American
17116 Posts

^
Quote :
"If I have a document with 7,500 different words and I want to know how many possible "phrases" I can make out of that document, to include single words so at least 7,500 base phrases...how would I go about doing that?

There's no set number of words for each phrase, it could be as short as 1 word (1 base phrase) or as long as 7,500 words long (also 1 phrase)

you can't repeat the same word when counting

you can make a phrase out of words that don't run next to each other (word 1, word 3, word 1114 = 1 phrase) but again, you can't repeat so there's no other combination of those exact 3 words that would amount to another phrase

god i hate math, this isn't for anything school-related, just trying to wrap my head around something..."

Quote :
"If I have a document with 7,500 different words and I want to know how many possible "phrases" I can make out of that document, to include single words so at least 7,500 base phrases...how would I go about doing that?"

Quote :
"If I have a document with 7,500 different words"

Quote :
"7,500 different words"

Quote :
"different words"

Quote :
"different"

11/25/2012 1:45:16 PM

oneshot
 
1183 Posts

The answer is and has always been 42.

11/25/2012 2:43:08 PM

modlin
All American
2642 Posts

^^
Quote :
"^^ the second thing, word 40 can only be used once but if it's "the" it can appear several times"

11/25/2012 5:04:21 PM

paerabol
All American
17116 Posts

Then that would answer modlin's question

11/25/2012 5:15:37 PM

BigEgo
Not suspended
24374 Posts

12

11/25/2012 6:01:55 PM

modlin
All American
2642 Posts

That part doesn't, but this:

Quote :
"you can't repeat so there's no other combination of those exact 3 words that would amount to another phrase
"


part does. I shoulda read closer.




So the way I understand it, you can make a three-word phrase of words:

1,2,3
1,2,4
1,2,n
1,2,n+1

as long as words 3,4,n, and n+1 are all unique.





So, you can't just write an equation to add up all the possible phrases you could make. You'd have to enter them all into a database and then have a computer go through and sequentially make each possible combination, and compare them to all previously made phrases to check validity.

11/25/2012 6:02:37 PM
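
The generate-and-compare approach described above can be sketched in a few lines of Python for a very small input. This is only an illustration under assumptions: the four-word "document" and the function name `phrase_counts` are hypothetical, and brute force like this is hopeless at 7,500 words. It shows how counting by word position and counting by distinct text diverge once a word repeats.

```python
from itertools import permutations


def phrase_counts(words: list[str]) -> tuple[int, int]:
    """Return (phrases counted by position, phrases counted by distinct text)."""
    by_position = 0
    distinct_text = set()
    for k in range(1, len(words) + 1):
        for picks in permutations(range(len(words)), k):
            by_position += 1
            # Two different position picks can spell the same text;
            # the set keeps only one copy.
            distinct_text.add(tuple(words[i] for i in picks))
    return by_position, len(distinct_text)


# Hypothetical four-word "document" in which "the" appears twice.
print(phrase_counts(["the", "people", "the", "union"]))  # (64, 34)
```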

paerabol
All American
17116 Posts

I'm sorry for being an asshole modlin. I would that we continue our mutual lack of preconception.

11/25/2012 8:52:26 PM

ThatGoodLock
All American
5697 Posts

holy shit you people are still talking about this...

I was being coy at first because I thought people wouldn't care much but you people really are nerds so I'll explain further and see where it leads me...

So I'm trying to budget for hiring a programmer that can do the following:

Create a website where legislation can be uploaded by the user by pasting the full official text into one "catch all" box (or alternatively, several boxes based on pasting section by section, subsection by subsection, etc... and then labeling it all together in order)

So for example, this is the preamble to the US Constitution (ignore the tabbed format, it was before I realized you could make a database delimited by space alone)



I can write a basic program that pastes this text into Excel and then delimits it by spaces, creating one cell per word.



Normally this would be displayed all on row 1 but for ease of explaining, i've fit all the text on the same screen in different rows
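
The same space-delimited split can be sketched outside a spreadsheet. The Python below is an illustration, not the poster's program: `to_word_objects` is a hypothetical name, and the text is the standard transcription of the preamble. It numbers the words from 1 so the numbering matches the "Objects 26-29" used later in this post.

```python
PREAMBLE = (
    "We the People of the United States, in Order to form a more perfect "
    "Union, establish Justice, insure domestic Tranquility, provide for the "
    "common defence, promote the general Welfare, and secure the Blessings "
    "of Liberty to ourselves and our Posterity, do ordain and establish this "
    "Constitution for the United States of America."
)


def to_word_objects(text: str) -> dict[int, str]:
    """Split on whitespace and number the words from 1, like spreadsheet cells."""
    return {i: word for i, word in enumerate(text.split(), start=1)}


objects = to_word_objects(PREAMBLE)
print(len(objects))                                  # 52
print(" ".join(objects[i] for i in range(26, 30)))   # promote the general Welfare,
```

Punctuation stays attached to its word here, which is what a plain split on spaces would also do in the spreadsheet.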

Now what I don't know how to do, and want to hire someone to do, is take this dataset and redisplay it on a webpage as "Document 1" (or whatever order it was uploaded in; in reality this would only be part of Document 1, the full Constitution) so that it looks to the user like normal text (as it was originally pasted in full; skip to the last image if you're confused), except that each word is now a selectable object (invisible boxes around each word?)
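
One possible way to get "invisible boxes around each word" is to wrap every word in its own HTML element when the page is generated. The Python sketch below is only an assumption about how that could look: the function name, the `word` class, and the `data-index` attribute are all made up for illustration, and the click handling would live in browser-side script that is not shown.

```python
from html import escape


def render_selectable(text: str) -> str:
    """Wrap each whitespace-separated word in a <span> carrying its 1-based index."""
    spans = [
        f'<span class="word" data-index="{i}">{escape(word)}</span>'
        for i, word in enumerate(text.split(), start=1)
    ]
    # Joining with spaces keeps the output reading like normal text.
    return " ".join(spans)


print(render_selectable("promote the general Welfare,"))
```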

after it displays what appears to be normal text, the user can then select words on an individual basis to build what I'll term "blobs" (or "phrases" in my original question) like so (ignoring that my picture is not displayed like normal text would and uses the spreadsheet still)



So here, the words "promote the general welfare", made up of Objects 26-29, will create Blob 1 once selected (there needs to be a "create Blob" button, since the words don't have to be selected all at once). Colors show that a blob has been created, and different colors represent different blobs.



This image shows what would happen if each object just named were its own blob, so Object 26 = Blob 1, Object 27 = Blob 2, etc...



This image shows what would happen if every word was used as part of exactly one blob, so Objects 1-7 = Blob 1, Objects 8-15 = Blob 2, etc... (this is assuming the blobs were created in the same order in which they appear)



This is only one blob. It's just using a ton of word objects from all over the place in order to be created.



Similarly, this is also one blob. It's just that it is using two nonconsecutive groups of consecutive word objects.
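
If a blob is defined only by which word objects are selected, so that picking the same objects in a different order gives the same blob, then counting blobs means counting non-empty subsets of positions: 2^n - 1, rather than the factorial-style sum discussed earlier in the thread. That reading is an interpretation of the examples above, not something stated outright. A small Python check, with `blob_count` as a hypothetical name:

```python
from itertools import combinations


def blob_count(n: int) -> int:
    """Number of non-empty subsets of n word objects."""
    return 2 ** n - 1


# Brute-force confirmation for small n.
for n in range(1, 10):
    enumerated = sum(1 for k in range(1, n + 1) for _ in combinations(range(n), k))
    assert enumerated == blob_count(n)

print(blob_count(4))                   # 15
print(blob_count(7500).bit_length())   # 7500
```

For 7,500 word objects that is a 2,258-digit number: still astronomically large, but far smaller than the ordered count.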



Now let's go back to the original blob I showed, where we have created Blob 1, "promote the general welfare".
I want everything I just described (the upload of text and display back to the user to select) to occur in Window 1. What I'm going to describe next should occur in Window 2 (it's all on the same screen, not separate programs; again, skip to the last image real quick).

Every time the "create blob" button is pushed, and after that blob is created, I want Window 2 to display a chat window with a "post comment" box at the bottom. Comments the user leaves appear at the top of Window 2 and are, in theory, about "promote the general welfare" and no other part. Each blob created has its own separate chat thread, and by clicking from blob to blob in Window 1 you can switch from one conversation to another in Window 2.
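
A minimal sketch of the data behind this, assuming a single user and in-memory storage only. Every name here (`Blob`, `Document`, `create_blob`, `blobs_at`) is hypothetical; it is one way the blob-to-thread link could be modelled, not a design from the thread.

```python
from dataclasses import dataclass, field


@dataclass
class Blob:
    blob_id: int
    word_indices: frozenset[int]  # which word objects make up the blob
    comments: list[str] = field(default_factory=list)  # this blob's own thread


@dataclass
class Document:
    words: list[str]
    blobs: dict[int, Blob] = field(default_factory=dict)

    def create_blob(self, indices: set[int]) -> Blob:
        """What a "create blob" button might call; indices are 1-based."""
        if not indices or not all(1 <= i <= len(self.words) for i in indices):
            raise ValueError("blob must use at least one valid word index")
        blob = Blob(blob_id=len(self.blobs) + 1, word_indices=frozenset(indices))
        self.blobs[blob.blob_id] = blob
        return blob

    def blobs_at(self, index: int) -> list[Blob]:
        """All blobs containing a given word, e.g. to pick a thread on click."""
        return [b for b in self.blobs.values() if index in b.word_indices]


doc = Document("provide for the common defence, promote the general Welfare,".split())
blob = doc.create_blob({6, 7, 8, 9})
blob.comments.append("What counts as general welfare?")
```

Because `blobs_at` returns a list, a word that belongs to several blobs simply yields several threads to choose from.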

This is a display of all the different Windows I've designed, numbered from 1-5.
Ignore all the other stuff I haven't talked about yet.



So just based only on the activities that I've described so far, what's a fair flat fee programming price? or a fair estimation of hours at an hourly rate?

ps - im so hopped up on mountain dew right now, this is probably not the clearest explanation at all

11/25/2012 10:16:47 PM

ThatGoodLock
All American
5697 Posts

oh and in case anyone is wondering why the original question mentioned ~ 7,500



[Edited on November 25, 2012 at 10:20 PM. Reason : f]

11/25/2012 10:20:06 PM

paerabol
All American
17116 Posts

man couple that up with an image-to-text converter on the front end for scanned or .pdf text sources and a direct interface to social media and RSS, with a lightweight document editor/exporter on the back and you've got a handy tool there

[Edited on November 25, 2012 at 11:05 PM. Reason : asdf]

11/25/2012 11:03:11 PM

ThatGoodLock
All American
5697 Posts

ive got plans for all that jazz in future iterations, but right now i'm just trying to get away with a minimum viable product

11/26/2012 12:58:04 AM

moron
All American
33720 Posts

That doesn't seem too difficult, shouldn't take too much time, and could be done using existing open source libraries ( functionally just forum software).

Nailing down the GUI is the hard, time consuming part, but there are plenty of existing libraries to handle that type of text selection.

11/26/2012 1:35:27 AM

ThatGoodLock
All American
5697 Posts

so when you say not too difficult can you attach some guess as to cost to hire? or hours to complete?

11/26/2012 1:31:39 PM

wdprice3
BinaryBuffonary
45908 Posts

working on genassem, eh?

11/26/2012 1:40:30 PM

ThatGoodLock
All American
5697 Posts

you know it!

if anyone wants to contribute to further funding or see a funky video I made
http://search.voltcrowd.com/campaign/detail/436

11/26/2012 4:47:54 PM

David0603
All American
12762 Posts

Can the blobs overlap?
Is existing db infrastructure for this already set up?
Maybe I'm underestimating "existing open source libraries ( functionally just forum software)" but this seems far from "not too difficult"

11/26/2012 5:41:04 PM

ThatGoodLock
All American
5697 Posts

yes, they can overlap and you can also have blobs within existing blobs or vice versa

nothing is set up for this. again, i'm looking for someone to code just a minimum product I can show off in order to get further funding. so even if it doesn't actually work in a multi-user environment, or even online (so it's standalone at first), if it can display the single-user experience that i've described above i'll be happy for now

11/26/2012 6:10:31 PM


© 2024 by The Wolf Web - All Rights Reserved.
The material located at this site is not endorsed, sponsored or provided by or on behalf of North Carolina State University.
Powered by CrazyWeb v2.38 - our disclaimer.