Talk:Prime factorization algorithm

Can someone give me the python script which returns

$ python factorize.py 1800
[2^3, 3^2, 5^2]

instead of

$ python factorize.py 1800
[2, 2, 2, 3, 3, 5, 5]

import sys,math
def factorize(n):
    def isPrime(n):
        return not [x for x in xrange(2,int(math.sqrt(n))+1)
                    if n%x == 0]
    primes = []
    candidates = xrange(2,n+1)
    candidate = 2
    while not primes and candidate in candidates:
        if n%candidate == 0 and isPrime(candidate):
            primes = primes + [candidate] + factorize(n/candidate)
        candidate += 1            
    return primes

def condense(L):
  prime,count,list=0,0,[]
  for x in L:
    if x == prime:
      count = count + 1
    else:
      if prime != 0:
        list = list + [str(prime) + '^' + str(count)]
      prime,count=x,1
  list = list + [str(prime) + '^' + str(count)]
  return list

print factorize(int(sys.argv[1]))
print condense(factorize(int(sys.argv[1])))

$ python factorize.py 1800
[2, 2, 2, 3, 3, 5, 5]
['2^3', '3^2', '5^2']

small bugfix[edit]

The code above, as written, would work; however, this is partly lucky. The isPrime function was inaccurate, as range doesn't include the higher end, so e.g. if checking for primality of 9, it would try numbers from 2 to 2, and conclude it was prime. I've added 1 to the upper end of the range so that the isPrime function works, in case anyone else comes along and tries to use it. The overall function worked only because it only picked off the smallest prime factor each time (and it could be sped up by not trying smaller prime factors than were found last time). --PsyMar (talk) 11:04, 2 March 2016 (UTC)[reply]

Wait, is xrange a thing in python2 maybe? I had to change it to range to make it work under python3, maybe this was the problem. --PsyMar (talk) 11:06, 2 March 2016 (UTC)[reply]

This comment is just to show up in the history, as I accidentally marked my "bugfix" comment minor. --PsyMar (talk) 11:07, 2 March 2016 (UTC)[reply]

some improvements of the code[edit]

The current sqrt() algorithm has limitations due to recursion. I've written a new one based on Square root that should run faster and accept larger values. Then I have rewritten the factorize() algorithm. The performance improvements here are marginal, I've just rewritten it for learning purposes. The source is available at [1]. Feel free to use it as you want.

C++ Example[edit]

If were going to have examples shouldn't they actually work? The C++ example wouldn't even compile, unless their is a new version of C++ that doesn't use namespaces or require an int return from main()? If you fix those minor errors however the example seesm to be putting out nonsense; id est, giving the same primes for output no matter what the input. I wish life were that easy. :D (I didn't check the logs, but I'm thinking this was originally a C example someone haphazardly converted to C++, or someone has a goat for a compiler.)

I'm putting it here until I or someone else can fix it.

#include <iostream>
#include <cmath>
#include <stdlib.h>

using std::cout;
using std::endl;

int isPrime(int n)
{
  for(int ind=2;ind<=sqrt(n);ind++) {
    if (n%ind==0) {
      return true;
    }
  }
  return false;
}

int main(int argc, char *argv[])
{
  // check argc to make sure we recieved a value.

  int n = atoi(argv[1]);

  for(int i=2;i<=n;i++)
  {
    while ( !isPrime(i) && n%i == 0 )
    {
      cout << i << " ";
      n=n/i;
    }
  }
  cout << endl;

  return 0;
}

I agree. This code is not useful. Seems like it could also use some cleanup as to the formatting. JamesJNHu (talk) 16:39, 23 September 2014 (UTC)[reply]

A small contradiction[edit]

I think there's a small contradiction between the paragraph "description" and "example". The descriptions says that if n is prime, then you have the factorization and you stop there; but in the example, when n become prime (i.e when n = 13), it's still testing if n can be divided cleanly by the next primer number (11). So, maybe someone should think about changing a little bit the example (must say it's a good example but like i try to explained (i'm french) it's a bit in contradiction with the description of the algorithm).

Oh and by the way, the C++ example is a bit funny, i don't see any object-oriented programmation in this, i wonder why it wasn't write in C... :)

Etienne —The preceding unsigned comment was added by 69.159.129.147 (talk) 20:31, 5 March 2007 (UTC).[reply]

It's sometimes convenient to use C++ just for the standard classes, like vector or iostream. CjPuffin 00:15, 6 March 2007 (UTC)[reply]

idea[edit]

Hello,
I wonder if it wouldn't be easy to factorize primes by something like tries or binary search.
It seems to me the immediate thought on how to solve this, but I don't realize why it shouldn't work, so please tell me.

let's say I have a database of prime numbers and the steps (1->2 2->3 3->5 4->7...)
let's say I have some prime1*prime2 result. (p*q)
I do result sqrt, and find a place to start searching from.

so I take the target's sqrt, and go down one step to some n (id of floor is n). I try n*n+1
and then n*n+2, n*n+3... until result is bigger than my target.
 if closest multiplication is not the target, 
   we do n-1 * n, n-1 * n+1, n-1*n+2... until result is bigger than target.

(n is the step in db not the prime itself)
and so on, until n=2.
if mod is more efficient than these queries, we can check for clean division, since we have the result: mod (result/n)
 if 0 we check if the answer is in db.
     if yes, we found our primes.
     if not: go down one step to prime(n-1).
 else
   go down one step to prime(n-1).
try again result/prime(n-1) -> if in db.. if not.. n-2.
this needs a direct match, but since we started in the square root there shouldn't be so many. we can also try binary search (jump a certain number, and minimize search by half each time) so we find the expected range faster, and minimize on the exact point.
it should be pretty trivial to create a little script and check if it works.
it seems like a childish solution and since even encryption algorithms rely on prime factorization it's hard to imagine that it'll work, but maybe I'll try just to find out why or if it fails.
do you know why?

Why it is so hard to factor The most secure RSA cyphers rely on 2048 bit numbers. That is 2²⁰⁴⁸ = 323170060713110073007148766886699519604441026697154840321303454275246551 388678908931972014115229134636887179609218980194941195591504909210950881 523864482831206308773673009960917501977503896521067960576383840675682767 922186426197561618380943384761704705816458520363050428875758915410658086 075523991239303855219143333896683424206849747865645694948561760353263220 580778056593310261927084603141502585928641771167259436037184618573575983 511523016459044036976132332872312271256847108202097251571017269313234696 785425806566979350459972683529986382155251663894373355436021354332296046 45318478604952148193555853611059596230656. This is about 10⁶¹⁷. I used python to calculate the above number. It is seriously that big. Currently all known Integer_factorization algorithms are slow as dirt when trying to factor the above number. I'm no expert in prime numbers, but even if there was only one prime in a million (1000000), you can clearly see how many primes there would be. It is not practically possible to calculate all the prime numbers, nor do we have the disk space to store them for fast access. BTW, if someone breaks the above number, the key just has to be made longer. A 4096 bit key would be 2²⁰⁴⁸*2²⁰⁴⁸. The only thing conceivable to break RSA and make it obsolete is to come up with a fast factoring algorithm or make a quantum computer.

A better algorithm[edit]

I'm a senior university student in computer science and I know a bit about this topic. The big algorithms are very complex and difficult to read. So I decided to post a much simpler algorithm. After some very careful study, I don't think this algorithm is any slower then the big one posted on the main page. It could even be faster...

#Python code to factor a number
#This code can factor any number of any length... eventually.
#This is because python has built in arbitrarily length numbers.
def factor(x):
    factors = []
    i = 2
    while x > 1:
        if x % i == 0:
            x = x / i
            factors.append(i)
        else:
            i += 1
    return factors

This algorithm attempts to divide the number in question by all positive integers starting at 2. Although the algorithm will admittedly try to pull out factors of non prime numbers, it is guaranteed not to.

(I believe the following theory is correct. Let me know if I'm mistaken because I just came up with this to simplify that massive algorithm.)

Proof: Lets assume that the algorithm tries to factor out a non prime number, say, 15. Because 15 is not prime, it will have factors less then 15, namely 3 and 5. Trying to factor out 15 is the same as trying to factor out 3 and 5 at once. However, because 3 and 5 are less then 15, the algorithm would have already factored them out. This means that the number 15 will never be factored by the algorithm.

In general any composite number will have factors less then the composite. Because all factors of the composite will have been removed from the number being factored before hand, the composite will not factor out. Therefor, the algorithm can only succeed in factoring out prime numbers. :D

Assuming the above prof is correct, the original number will have all of its factors removed and be reduced to 1. When that happens the algorithm terminates. This more or less proves that the algorithm will halt when the number is factored and not keep going on forever.

A Bug

one more comment on the above algorithm, when try 1234567890987654321 it output: [3, 4, 4, 4, 17, 29, 431, 463, 2557, 25561], where 4 is not prime. — Preceding unsigned comment added by 128.120.52.135 (talk) 21:10, 27 August 2013 (UTC)[reply]

Solution: For Python 3 use x = x // i instead of x = x / i. — Preceding unsigned comment added by 95.105.176.163 (talk) 21:00, 11 December 2013 (UTC)[reply]

This is the trial division algorithm. The version you describe, on a b-bit number, may take as many as 2^b iterations (for instance, it will do so whenever its input is prime). It may be sped up to roughly 2^b/2 by stopping and giving up when i is around the square root of the remaining value x, but this is still much much slower than state-of-the-art algorithms. In any case, if you actually had a new algorithm, rather than a reinvention of the wheel, this would not be the place to promote it. This talk page should be for discussion of improvements to the Wikipedia article only, not for discussion of improvements to the topic that the article is about. —David Eppstein (talk) 21:06, 11 December 2013 (UTC)[reply]